



STATISTICS : 
THEORY AND PRACTICE 


BY 

M. K. GHOSH, M.A., B.COM. (Lond.), A.M.INST.T.

University Professor and Head of the Department
of Commerce, University of Allahabad.

Author: Transport Development 
and Co-ordination. 

Joint Author: Insurance Principles,
Practice and Legislation

AND 

S. C. CHAUDHRI, M.A., B.COM.

Lecturer in the Department of Commerce, University
of Allahabad. Allahabad Jubilee Gold
Medalist and Ex-Research Scholar,
Allahabad University


First 


Allahabad 

THE INDIAN PRESS LIMITED 
1943



Published by

K. Mittra, at the Indian Press Ltd.,
Allahabad.


Printed by

3. N. Bose, at the Indian Press Ltd.,
Calcutta.



PREFACE 


Statistics was once known as the Science of Kings, but now it
has gained ground in almost every branch of human knowledge. For,
the superstructure of human activity rests ultimately, if not
primarily, upon a foundation of quantitative facts: facts whose
inherent complexity and confusion can be simplified and analyzed,
and which can be interpreted, only with a knowledge of statistical
methods. In this Age of Statistics, therefore, the importance of
the study of the Science of Statistics cannot be over-emphasized,
particularly for India whose development, in many spheres, is yet
in its infancy. Indeed, the importance is being recognized, and
the Indian universities have taken the lead in the matter.
Naturally, the necessity of a suitable text-book on the subject
for Indian students is more than made out.

This book is an attempt to furnish a simple, but comprehensive,
text for those who desire to equip themselves with a
knowledge of the elementary statistical methods to enable
themselves to handle statistical problems like skilled workmen.
It is, of course, primarily intended for the benefit of those
interested in Economics, Commerce, Sociology or Administration,
but the general principles it comprises will be suited equally
well to every other variety of statistical data.

In a book such as this, the use of the viewpoints and materials
of other works of parallel and higher standards is
unavoidable. And, indeed, such works have been our valuable
guide. But we have made every effort to so synthesize all these
materials as to bring about unity and harmony. The treatment is
non-mathematical, chiefly because a majority of those for whom
this book is primarily meant are not expert mathematicians, and
also because we feel there is a necessity of fundamental exposition
of the non-mathematical, nonetheless vital, processes involved in
statistical inquiries, analysis and interpretation. Naturally,
therefore, a fuller discussion of topics like Probability, Sampling,
Regression, etc., which require mathematical treatment, could not
be included. Once the readers have overcome their feeling of
unfamiliarity and grasped the basic principles, it will be easy
for them to pick up the higher and more mathematical statistics.
The discussion on Statistical Material in India and on Indian
Index Numbers does not pretend to be exhaustive, but is designed
to make the Indian student look around him. Special care has
been taken to select exercises for each chapter suited to the
M.Com., M.A., and B.Com. standards of different universities of
India.

Our thanks are due to Mr. Shiam Bahadur Kodasi, M.A.,
B.Com., who helped us in correcting the proofs. We shall be
thankful for any suggestion to increase the usefulness of the book.


Department of Commerce,
University of Allahabad.
November, 1943.


MOHIT KUMAR GHOSH
SUSHEEL CHANDRA CHAUDHRI



CONTENTS 


CHAPTER I

GROWTH OF THE SCIENCE OF STATISTICS

Origin; Mercantilistic Period; 16th Century; 17th Century;
18th Century — Statistics and Mathematics;
19th Century — Statistics and Economics. Exercises

CHAPTER II

DEFINITION OF STATISTICS

Definition of Statistics (data); Characteristics of
Statistics; Statistical Methods; Science of Statistics
defined; Functions of a Statistician; Main
Divisions of Statistics. Exercises

CHAPTER III

FUNCTIONS AND IMPORTANCE OF STATISTICS

Functions of Statistics; Importance of Statistics;
Limitations of Statistics; Distrust of Statistics.
Exercises


CHAPTER IV

STATISTICAL INQUIRIES AND UNITS

Types of Statistical Inquiries; Units of Measurement;
Simple and Composite Units, and Coefficients.
Exercises


CHAPTER V

COLLECTION OF STATISTICAL DATA

Primary and Secondary Data; Primary Method — 1.
Direct Personal Investigation, 2. Indirect Oral
Investigation, 3. Estimates from local sources or
Correspondents, 4. Investigation through schedules
to be filled by the Informants, 5. Investigation
through schedules in charge of Enumerators;
Choice of Enumerators; Choice of Questions;
Selection of Representative Data, Theory of
Probability and Law of Inertia of Large Numbers;
Secondary Method — 1. Utilizing Published
information, 2. Utilization of Business Intelligence
Service bulletins, 3. Utilization of Unpublished
data or manuscripts, 4. Utilizing information
collected by other agencies or for other purposes.
Exercises


CHAPTER VI

EDITING THE COLLECTED DATA

Editing Primary Data; Accuracy; Statistical Errors;
Measurement of Error; Biassed and Unbiassed
Errors; Approximation; Editing Secondary Data.
Exercises


CHAPTER VII

STATISTICAL MATERIAL IN INDIA

Chief Sources; Short-comings of Official Statistics;
Examination of some official statistics — Statistical
Abstract of British India, Agricultural Statistics,
Prices, Wages and Cost of Living, Trade Statistics,
The Census Reports, Vital Statistics. Exercises


CHAPTER VIII

CLASSIFICATION AND TABULATION OF DATA

Classification — Classification according to Attributes;
Classification according to Class-intervals;
Statistical Series; Time, Spatial and Condition Series;
Continuous and Discrete Series; Tabulation —
Rules and Precautions for Tabulation; Different
types of Tabulation. Exercises

CHAPTER IX

SIMPLE DERIVATIVES

Derivatives defined; Subordinate Derivatives;
Co-ordinate Derivatives — 1. The Simple Difference,
2. The Percentage Difference, 3. The Ratio, 4. The
Rate; Purpose of Computing Statistical Derivatives;
Derivative Series; Rules and Precautions for
Computing Derivatives; Ratios; Use of Simple
Derivatives. Exercises

CHAPTER X

STATISTICAL AVERAGES

Average defined; Homogeneity of Data; Kinds of
Average: The Mode — Location, Adv., Disadv.,
Uses; The Median — Determination, Adv., Disadv.,
Uses; Quartiles, Deciles & Percentiles — Location,
Characteristics; The Arithmetic Average —
Simple Average, Measurement by Direct and Short-cut
Methods, Adv., Disadv., Uses; Weighted
Average, When should Weighted Average be used?
The Geometric Average — Determination, Weighted
Geometric Mean, Adv., Disadv., Uses; The
Harmonic Average — Determination, Characteristics
and Uses; Averages of the First Order; Typical
and Descriptive Averages; Choice of Averages;
Limitations of Averages; Standardized Death Rate.
Exercises

CHAPTER XI

DISPERSION AND SKEWNESS

Dispersion — Meaning; Measures of Dispersion:
Method of Limits — The Range and its Coefficient;
Method of Averaging Deviations —
(1) First Moment of Dispersion or
Average Deviation and its Coefficient, their
Calculation, Characteristics and Uses, (2)
Second Moment of Dispersion, Standard Deviation
and its Coefficient, Their Calculation by Direct
and Short-cut Methods, Characteristics and Uses,
Modulus, Variance, Coefficient of Variation, (3)
Quartile Range and its Coefficient, Their Calculation,
Characteristics and Uses; Choice of Measures
of Dispersion; Absolute and Relative Measures of
Dispersion; Relation Between Measures of Dispersion;
Lorenz Curve; Practical Utility of Measures
of Dispersion; Skewness — Tests of Skewness;
Measures of Skewness; First Measure and Coefficient
of Skewness, Second Measure and Coefficient
of Skewness; Positive and Negative Skewness;
Dispersion and Skewness contrasted. Exercises


CHAPTER XII

INDEX NUMBERS

Definition; Fluctuations in General Price Level;
Construction of Index Numbers of Prices: Selection
of items; Choice of Base — Fixed Base Method
and Chain Base Method; Type of Average to be
used — Arithmetic Mean, Median and Geometric
Mean, Chain Relatives, Reversibility of Index
Numbers, Base Shifting; The System of Weighting
— Implicit and Explicit Weighting, Methods of
Weighting — Weighted Average of Relatives,
Aggregative Method, Fisher’s Ideal Formula — Time
Reversal Test and Factor Reversal Test; Summary
and General Remarks; Cost of Living Index
Numbers — Difficulties in Construction, Construction,
Aggregate Expenditure Method, Family
Budget Method, Errors in Cost of Living Indices,
Their Unsatisfactory Character; Indices of Industrial
Activity; Indices of Business Conditions;
Uses of Index Numbers. Exercises


CHAPTER XIII

INDIAN AND FOREIGN INDEX NUMBERS

Indian Index Numbers: Current Wholesale Price
Index Numbers — Calcutta Index Number, Bombay
Index Number, Economic Adviser’s Index Number,
Their Inadequacy; Discontinued Wholesale and
Retail Price Indices — Indices of Prices for
Exported and Imported Articles, Indices of Retail
Prices of Food Grains, Weighted Index Number
of Wholesale Prices; Cost of Living Index Numbers
— Diversity in Scope and Construction, Bombay
Working Class Cost of Living Index; Government
of India’s Latest Schemes — The Main Cost of
Living Index Number Scheme, Retail Price Index
Number Scheme for Urban and Rural Centres;
Industrial Activity Index — “Capital” Index of
Indian Industrial Activity; British Index
Numbers: Wholesale Price Indices — Board of
Trade Index, Economist Index, Statist Index; Cost
of Living Index — Ministry of Labour’s; Indices of
Production — London and Cambridge Index, Board
of Trade Index; Indices of Business Activity —
Economist’s Index; United States’ Index
Numbers: Wholesale Price Index Numbers —
Bureau of Labour Statistics’, Federal Reserve
Board’s, Dun’s, Annalist’s, Fisher’s; Cost of
Living Index Number — Bureau of Labour Statistics’;
Indices of Production — Harvard Committee’s;
Indices of General Business Conditions — Harvard
Index. Exercises

CHAPTER XIV

DIAGRAMMATIC REPRESENTATION

Usefulness of Diagrams; Directions for drawing
Diagrams; Different Forms of Diagram; One
Dimensional Diagrams — Simple Bar, Subdivided
Bar; Two Dimensional Diagrams — Rectangles,
Squares; Circular Diagrams — Circles; Angular
Diagrams — Sectors; Three Dimensional Diagrams
— Cubes; Pictograms — Maps and Pictures;
General Remarks. Exercises


CHAPTER XV

GRAPHICAL PRESENTATION

Diagrammatic and Graphic Presentations Contrasted;
Graphs of Continuous Time Series: Rules
for drawing Graphs — Choice and adjustment of
scales, plotting the data; Different Types of
Graphs on the Natural Scale — Absolute Historigram
of one Variable, Absolute Historigrams of two
or more variables (Homogeneous and Heterogeneous
Units), Index Historigrams to Compare Changes
of two or more Variables, Method of Scale Conversion
for Comparing Changes in two or more
Variables, False Base Line; Graphs on “Ratio”
Scale — Ratio Scale, Logarithmic Curves, Instructions
for Reading of Logarithmic Curves, Advantages
and Disadvantages of Ratio Scale; General
Remarks; Frequency Graphs: Statistical Nature
of a Group; Frequency Graphs for Discrete
Series; Frequency Graphs for Continuous Series —
Histogram, Frequency Polygon, Frequency
Curve, Ogive Curve; Galton’s Method of Locating
the Median. Exercises

CHAPTER XVI

ANALYSIS OF TIME SERIES

Trend, Seasonal and Cyclical Fluctuations; Measuring
and Isolating Time Changes; Elimination of Short-time
Oscillations — Freehand Curve Method,
Method of Moving Averages; Periodicity and
Cyclical Fluctuations; The Smoothed Curve;
Elimination of Long-time Variations; Measuring
Seasonal Variations; Comparison of Time Changes
in two Historigrams. Exercises

CHAPTER XVII

CORRELATION

Meaning of Correlation; Degree of Correlation; Coefficient
of Correlation; Study of Correlation; Karl
Pearson’s Coefficient of Correlation; Calculation of
Pearsonian Coefficient of Correlation — Direct
Method, Short-Cut Method; Coefficient of Correlation
for Long-time Changes; Pearson’s Modified
Coefficient for use with Short-time Oscillations;
Calculation of Correlation Coefficient in Grouped
Series; Assumptions of Pearsonian Correlation;
Characteristics of Pearsonian Coefficient; Probable
Error of the Coefficient; Interpretation of Correlation;
Coefficient of concurrent Deviations; Correlation
by Graphic Method; Graphic Correlation of
Time Changes. Exercises



CHAPTER XVIII

ASSOCIATION OF ATTRIBUTES

Statistical Attributes; Notation and Terminology;
Probability and Expectation; Criterion of Independence;
Association and Disassociation; Coefficient
of Association; Partial Association. Exercises



CHAPTER XIX

INTERPOLATION AND FORECASTING

Necessity of Interpolation; Assumptions, Accuracy
of Interpolation; Methods of Interpolation: The
Graphic Method — Graphic Method in a Continuous
Series; Graphic Method and Periodic Figures,
Graphic Method and Correlation Curves; Algebraic
Treatment — First Method — Fitting with a Parabolic
Curve, Second Method — By means of advancing
differences, Third Method — Lagrange’s Formula;
Forecasting; Conclusion. Exercises


CHAPTER XX 

INTERPRETATION OF DATA 

Interpretation; Preliminaries to Interpretation; Mis- 
takes due to False Generalisation; Wrong Inter- 
pretation of Index Numbers; Wrong Interpretation 
of Coefficient of Correlation; Wrong Interpretation 
of Coefficient of Association; General Directions 
for Interpretation. Exercises 

Appendix I A Specimen of a Blank Form

Appendix I B Specimen of a Questionnaire 

Appendix II List of Important Statistical Pub- 
lications 

Appendix III Measurement of the National Income 
of India 

Appendix IV Logarithm 

Mathematical Tables 
Index 






STATISTICS : THEORY AND PRACTICE 


CHAPTER I 

GROWTH OF THE SCIENCE OF STATISTICS

The word ‘Statistics’ seems to have been derived from
the Latin Status, meaning a political state. In fact, the study
of Statistics had its origin in the compilation of facts and
figures for purposes of administration of state. In this sense,
the subject must have been in existence from very early times.
In the days of yore the ruling chiefs used to take, as often as
necessary, a census of population and property within their
domain to determine their man-power and material strength,
and thereby planned their fiscal and military policies.
Collection of data for other purposes, however, was not ruled
out. Perhaps one of the earliest enumerations made was
regarding the population and riches of Egypt, taken about
3050 B.C., to plan the erection of Pyramids. But, the most
common compilations during the Middle Ages were concerned
with taxation, distribution of land and available soldiers. In
India, administrative statistics were highly organised nearly
two thousand years ago. Inscriptions and technical treatises
abound in references to various kinds of statistics for the
classic period of Sanskrit culture. A system of registration
of births and deaths was enforced in Maurya India, while the
Ain-i-Akbari, a great administrative and statistical survey of
India, was compiled during the reign of Emperor Akbar.
Past history of other countries also bears witness to the fact
that Statistics was originally concerned with matters of state
and was regarded as the Science of Statecraft.





Mercantilistic Period 

During the Mercantilistic period the policies of the
Western European governments were directed to the dual
purpose of encouraging such industries as enhanced the power
of the state, and of securing a favourable balance of trade.
This necessitated legislation for social, economic and political
reforms, which, to be effective and adequate, called for more
comprehensive statistics than were considered sufficient during
the Middle Ages. The bulk of statistical compilations,
consequently, increased.

16th Century. 

The ancient astronomers contributed much to the propagation
of the study of Statistics. They compiled records of the
motions of heavenly bodies, and predicted eclipses and
positions of stars. Upon a study of the data collected by
Tycho Brahe (1546-1601), Johannes Kepler discovered the
three laws relating to the motion of planets, on which the
theory of gravitation was founded by Sir Isaac Newton.
When the utility of statistical method for attaining the
knowledge of nature was demonstrated, enthusiasts in political,
social and economic fields began resorting to a similar
approach. Accumulation of a large mass of data was the
natural result.

17th Century. 

The seventeenth century opened with a new use for some
of the compiled figures, viz. a study of vital and social
statistics. In 1612, Professor George Obrecht, of Strasburg
University, illustrated how vital and criminal statistics could
be utilised for devising plans to provide a system of life
insurance and pensions and to reform the criminals. Captain
John Graunt of London (1620-1674) made an analytical study
in the realm of vital statistics in 1661. Casper Neumann
studied the death records of Breslau in 1691 and prepared his
notes and conclusions, which fell into the hands of Edmund
Halley, the famous astronomer and scientist, through the Royal
Society of London. Halley computed from them a complete
life table, deduced the expectation of life at each age and
paved the way for a scientific system of life insurance. Sir
William Petty (1623-1687) also drew up and discussed
mortality tables. Indeed, the first life insurance institution
was founded in London in 1698.

18th Century — Statistics and Mathematics. 

With the statistical data growing in abundance, and many
new fields having been opened up for investigation, need was
soon felt for improving upon the hitherto used, crude and
cumbersome, methods of analysing and interpreting the figures.
The labours of Petty and Halley had prepared the ground
for a more scientific treatment of the statistical methods in the
eighteenth century, which they received particularly at the
hands of J. P. Süssmilch (1707-1767), a Prussian clergyman,
who tried to demonstrate the doctrine of ‘Natural order’
statistically, in an important publication. Others devised
statistical tables and geometric figures for purposes of
comparison of data. But modern theory of Statistics was, thus
far, conspicuous by its absence. John Graunt, Petty and
Süssmilch conducted their studies, during the seventeenth and
eighteenth centuries, under the name of ‘political arithmetic’,
which functioned as eyes and ears of central government.

In the eighteenth century, however, an alliance was
effected between Statistics and Mathematics, and foundations
of the theory of probability were laid, when J. Bernoulli
(1654-1705), a professor of Basel, mathematically elucidated
the ‘Law of large numbers’ in his work Ars Conjectandi,
published posthumously, and Daniel Bernoulli (1700-1782)
suggested the theory of ‘moral expectation’. The subject of
probability, it may be interesting to note, grew out of an
analysis of hazards of those who played games of chance.
Laplace, the noted scientist, who followed up Süssmilch’s
statistical studies, raised the superstructure of the theory of
probability in his creditable work published in 1812. And
so did Gauss. But, it was left to the famous Belgian
astronomer and mathematician, L. A. J. Quetelet (1796-1874)
to lay the foundations of the modern theory of statistics.
Theory of Statistics, it should be emphasized, owes much to
the mathematical theory of probability. Quetelet’s meteorological
researches brought him to a study of vegetation, of
animal kingdom, and then of mankind. Upon a study of
the physical, social and moral characteristics of men, he found
that every phenomenon yielded similar results. In each case
there existed an ‘average man’ representing the average
physical and mental qualities of society, and all other men,
in respect of any particular character, would diverge from the
‘average man’ with mathematical regularity. He found it to
be true of all human actions, since he demonstrated that crimes,
suicides, accidents all showed comparatively constant figures.
He concluded that deviations from the ‘average man’ were
subject to binomial law and that the methods of probability,
which had proved useful in treating errors of observation,
could be profitably employed in Statistics. In fact, Quetelet
recognised the significance of the principle of Constancy of great
numbers, upon which the modern theory of Statistics rests.

19th Century — Statistics and Economics.

Several eminent mathematicians made their contributions
to the theory of probability during the nineteenth century, and
the theory of Statistics began its gradual and steady advancement.
Knapp (1842-1926) and Lexis (1837-1914) in Germany
followed up Quetelet’s principles and attempted a comprehensive
study of the statistics of mortality. Sir Francis Galton
(1822-1911), founder of the school of Eugenics, deserves a
pioneer honour among workers on Statistics. He conducted
an enquiry into the principles relating to the transmission of
mental and physical characteristics from one generation to
another. His enquiry helped his great successor, Karl
Pearson (1857-1936), to produce his notable work on biometry
and to emphasize the indispensability of Statistics for the
evolutionist, as in his opinion the whole problem of evolution
was a problem of statistics.

Statistics appeared rather late in the field of the Science
of Economics, though a beginning was made by Sir William
Petty in his work, Political Arithmetic, published in 1690, as
also by Gregory King, who, about the same time, attempted
to statistically demonstrate a relationship between supply of
commodities and prices. By the eighteenth century valuable
statistical material relating to population, occupations, taxes,
agriculture, industry, trade, shipping etc., had been collected
in most civilized countries; but there was no liaison whatsoever
between statistical information and economic theory. Political
Economy was brought up in the classical school, founded by
Adam Smith, through his great work Wealth of Nations,
published in 1776. Classical economists were staunch believers
in the deductive and abstract method of reasoning, their
lip-sympathy, such as that held by J. S. Mill (1806-1873), to the
advantage of Statistical verification of deductive laws
notwithstanding. W. S. Jevons (1835-1882), in his ‘Theory of
Political Economy’ published in 1871, also advocated verification
of the deductive science of economics by the inductive
science of statistics, and opined that political economy could
be developed into an exact science, if only commercial
statistics were more complete and precise. Although Cournot
(1801-1877), a renowned mathematical Economist and writer
on probability, did statistics a signal service by hinting at
the application of the calculus of variations and making a
first casual suggestion regarding the distinction between
secular trend and periodic fluctuations, yet, it was W. S. Jevons
who segregated seasonal movements, secular trends, and
cycles, much as the modern writers do. Jevons’ statistical
work on prices was of a high order and he has been accorded
the title of ‘the father of index numbers’. He may be said
to have put Statistics into economics. In his Theory of Political
Economy, he wrote, ‘I know not when we shall have a perfect
system of statistics, but the want of it is the only insuperable
obstacle in the way of making economics an exact science.’
The most emphatic weight on the introduction of statistics
into the study of economics, however, came from the Historical
School (1843-1883). This school, of which Roscher, Knies and
Hildebrand were the representatives in Germany and Cliff
Leslie in England, believed that economic doctrines were not
to be reasoned out in the abstract but to be historically or
inductively proved. The school, therefore, laid stress on
history for past events and on statistics for the present ones.
By the end of the nineteenth century the attitude of a large
body of economists towards the inductive methods had become
friendly. Alfred Marshall could write by 1907 that disputes
as to the methods of study of economics had ceased, that
qualitative analysis had performed the larger part of its job,
and progress in the quantitative analysis depended upon the
growth of realistic statistics. He asserted that induction and
deduction were both needed for scientific thought as the right
and the left foot were both needed for walking. Pareto
(1848-1923), whose work on Political Economy contained a
comprehensive collection of statistics, expressed the conviction,
in 1907, that the progress of economic science depended largely
upon the investigation of empirical laws derived from
Statistics. In F. Edgeworth (1845-1926) could be found
an economist, whose work on index numbers and correlation
was particularly important and who greatly advanced the
solution of statistical problems. Keynes, in his Scope and
Method of Political Economy, points out that ‘the function of
statistics is first, to suggest empirical laws, which may or
may not be capable of subsequent deductive explanation; and
secondly, to supplement deductive reasoning by checking its
results, and submitting them to the test of experience’. It
is now widely held that induction without deduction shall be
barren, deduction without induction sterile.

Jevons, Theory of Political Economy, 4th edition, page 12.

Since the last decade of the last century two important
factors have brought about a fundamental change in the place
of statistics in economics. Since the eighteen-eighties, the pure
theory of statistics has made a remarkable improvement.
Eminent men like August Meitzen, Francis Edgeworth,
Francis Galton, Karl Pearson, G. Udny Yule, C. H. Davenport,
W. S. Gosset, A. L. Bowley, Adams, W. Persons, W. I. King
and R. A. Fisher have done a great deal in advancing the
theory far beyond its former limits. The development of
statistical methods (of probability, sampling and curve-fitting,
correlation, periodicity, and index-numbers) closely coincided
with the enlargement of figurative data, made possible by the
establishment of statistical bureaus and scientific recording of
population census in different countries of the world. These
improvements in statistical material about the close of the
nineteenth century mark the real inception of statistics in
economics.

Thus there has grown up a kinship among Mathematics,
Economics and Statistics. The modern science of Statistics is
no longer synonymous with ‘political arithmetic’. It has
extended its scope to varied departments of human knowledge.
It is concerned not merely with matters of state but also with
the physical, biological, anthropological, meteorological, social,
economic and other phenomena. Its methods are applied
wherever a study of large numbers is involved.


Keynes, Scope and Method of Political Economy, ed. page 338.





EXERCISES 

(1) Trace the growth of the science of Statistics, and throw
light on its future.

(2) ‘Mathematics has played in the past, as it does even today,
a great part in Statistical Theory, and there could be no theory
without it, but that theory is no more a branch of mathematics
than is, say, engineering or astronomy.’ Discuss.

(3) Explain the relationship between Economics and
Statistics. How far has the use of statistical methods in
Economics led to its development?

(B. Com., Lucknow, 1942).

(4) ‘Statistics was originally concerned with matters of state
and was regarded as the Science of Statecraft.’

Show, in the light of the above statement, how Statistics could
have been of use and necessity to state, in ancient times. Is it of
some utility today also?

(5) How far has the growth of Statistics coincided with the
development of natural and social sciences?

(6) ‘Statistics was born in the needs of state administration,
but is no longer concerned with matters of state.’ In the light of
this statement, show how and what transformation has taken place
in the meaning of statistics.

(7) Show the relationship between statistics and mathematics,
and statistics and natural sciences.

(8) Do you think that Statistics is an apparatus through
which the validity of the laws of natural and social sciences can
be tested?

If yes, would these sciences have made the progress they have
done in the absence of Statistics?

(9) How and when did Statistics come to be related with
Economics? Has this relationship been of mutual good?

(10) Throw some light on the importance of Statistics from
its past history, and discuss its indispensability in all modern
studies.



CHAPTER II 


DEFINITION OF STATISTICS 

Statistics* are ‘numerical statements of facts in any
department of inquiry, placed in relation to each other’.¹
Isolated, unconnected figures are not statistics. 20, 38, 67, or
15, 32, 55 are undoubtedly quantitative figures, but not
statistics. For, neither do they concern a sphere of inquiry,
nor are they placed in relation to each other. But when we
are told that for husbands, in a certain community, at ages
20, 38, 67 years and so on, the corresponding ages of wives
are 15, 32, 55 years and so on, these figures at once become
statistics. For, now they throw light on the relationship
between the ages of husbands and wives in the given
community.
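The point of this example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the ages are the figures quoted in the text, while the variable names and the derived "difference" column are introduced here and are not part of the original.

```python
# Ages quoted in the text; the variable names are illustrative assumptions.
husband_ages = [20, 38, 67]
wife_ages = [15, 32, 55]

# Taken separately, each list is only a set of isolated figures.
# Placed in relation to each other, they become paired observations.
couples = list(zip(husband_ages, wife_ages))

# One simple comparison the pairing makes possible (an added illustration):
# how many years older each husband is than his wife.
age_gaps = [h - w for h, w in couples]

for (h, w), gap in zip(couples, age_gaps):
    print(f"husband {h}, wife {w}, difference {gap} years")
```

The pairing step is what turns the two lists into statistics in the sense defined above: each figure now belongs to a definite inquiry and can be compared with its counterpart.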

Characteristics of Statistics 

Sevei'al facts emerge from the above. Firstly, statistics 
must be quantitatively expressed. Qualitative expressions like 
^ youn^ ‘ niiddle-ajifed ' and ‘ old ' have been indicated by 
numerical expressions like 20, 38, and 67 or 15, 32, and 55. 
Crops over a series of years, expressed in maunds per acre, 
are statistics, but expressed by^ such terms as ‘ ^ood \ ‘ fair". 
* noimial or ‘ poor are not. Secondly, statistics are always 
aggregates. A sinjirle a^e of 20 or 38 years is not statistics; 
a series of a«:es are. A single birth, a sale or a consip;nnient 
does not form statistics. Yet numbers of births, sales, consig:!!- 
ments are statistics, since they can be studied in relation to 

* The term statistics is applied to the science of statistics as well 
as to its subject matter. In the former sense it is iiseii as sinpfular, in 
the latter as plural, noun. 

' Bowlev, A. L: **An Elementary Manual of Statistic^/* page 1, 

time, place and frequency of occurrence. Thirdly, statistics
should relate to a department of inquiry. That is, the sphere
on which they are to throw light must be definite and clear.
Their purpose and object must be pre-determined. The
purpose of a series of ages of husbands and wives may be to
find whether young husbands have young wives and the old,
old. Fourthly, statistics must be capable of being placed in
relation to each other. That is, they should be comparable.
To be so they should be homogeneous: they cannot be stray
numerical facts, unrelated data, culled from indiscriminate
sources having no common basis for selection. Ages of
husbands are to be compared only with the corresponding ages
of wives.

But these are not the only characteristics which statistics
possess. It should be added that statistics are affected to a
considerable extent by a multitude of causes. They are
hardly ever traceable to a single cause. They are related to
other facts. The stature of a man, e.g., is causally connected
with his race, ancestry, diet, age, occupation, climate and
habit.

Statistical Methods. 

These statistics constitute the raw material which must
pass through certain mechanical processes to yield finished
products. The most important step to be taken is to reduce
the multiple causes affecting the data to a comparatively small
residuum. For this purpose, the experimental method has
been carried to perfection in sciences like physics and
chemistry, where the data are capable of being measured with
reasonable accuracy, the salient factors of the problem are few
and simple and conditions are within the experimenter's
control. These circumstances favour the application of the
experimental method. Experiment has the merit of replacing
complex and varied systems of causation by simple ones,
where only one cause would vary at a time. But the experimental
method proves unsatisfactory and inadequate in
sciences like biology and sociology, where the data are mixed
with extraneous, irrelevant matters, factors of the problem are
many and complex and conditions are not under control.
These circumstances do not permit experiment. For this
reason, physics and chemistry are classed as exact, and biology
and sociology as inexact, sciences. The inexact sciences, denied
simplification of data through the experimental method, have
in general to deal with data as they occur, data affected to a
large extent by a number of alternative causes acting jointly
or severally. They apply some method other than the experimental
one to render their complex and unwieldy data intelligible.
To serve this end, important and more persistent factors
influencing the data have to be segregated from the casual
disturbances that cancel out in the long run, and the extent
to which the observed effect results from the operation of each
one of the former factors has to be studied. The collected
figures are scientifically analysed; they are classified, tabulated,
compared, correlated and finally interpreted. Methods employed
in these different processes are termed ‘Statistical
Methods’. Thus, statistical methods are the devices by the
application of which quantitative data influenced by multiple
causation are collected and so scientifically analysed and
elucidated that they are brought within easy and clear grasp.

It should not be understood that the machinery of
statistical methods is employed only in the inexact sciences.
Statistical methods may also be profitably used in exact
sciences, for whatever the perfection of the experimental
devices they can hardly ever be absolutely perfect. The
observer of physical or chemical phenomena and the instrument
of observation are both sources of error; and the effects
of changes of moisture, pressure, temperature etc. cannot be
totally eliminated. Statistical methods are, therefore, the
handmaid of both the exact and the inexact sciences, but are
of greater service to the latter. Experimental method is, of
course, more precise than the statistical one, but the latter 
often affords fairly successful results when the former fails. 

The finished products resulting from statistical analysis 
are also known as statistics, e.g., statistics of foreign trade 
of India. Thus both the raw material to which statistical 
methods are applied and the resulting finished products are 
termed as statistics. It will be seen later (in Chapter V) that
the raw materials are called Primary Data and the finished 
products Secondary Data. Statistical methods and statistics 
are intimately connected, since the quality of goods turned 
out depends on the perfection of the machine producing them. 
Statistical methods are concerned with the technique of 
collection of data, with their analysis and interpretation. A 
scientific exposition of these methods should, therefore, be
named the Theory of Statistics. 

Science of Statistics defined. 

Different authors have given different definitions of
statistics emphasizing different aspects. Webster defines
Statistics as “classified facts respecting the condition of the
people in a state ... especially those facts which can be
stated in numbers or in tables of numbers or in any tabular
or classified arrangement.”¹ This definition is much in keeping
with the original meaning of the term statistics, viz., science
of statecraft. In its modern sense the term is not confined
to ‘the condition of the people in a state’, but has stretched
itself to almost every phenomenon, biological, astronomical,
social or meteorological, where a study of large numbers is
involved. Webster's definition is therefore inadequate to
depict the modern notion about statistics. According to
Bowley, “Statistics is the science of the measurement of social
organism, regarded as a whole in all its manifestations.”²
This definition, according to its author, concerns the student
¹ Quoted by King, W. I., Elements of Statistical Method.

² Bowley, A. L., Elements of Statistics, page 7.



of sociology, political economy or demography. But when the
author recognises that ‘Statistics is not merely a branch of
political economy nor is it confined to any one science’,¹ its
definition should not have been so drawn up as to limit its
operations to only one field, viz. that of man and his activities.
This definition is, therefore, not sufficiently inclusive. Bowley
further observes: ‘Statistics may rightly be called the science
of averages.’² No doubt, averages present a bird's-eye view
of a mass of unintelligible data, but there are other equally
serviceable devices, such as graphs, pictograms, correlation
tables and coefficients, which modern statistics utilizes to
comprehend the significance of the complex quantitative data.
Therefore, while this definition does not confine the scope of
the science to a particular phenomenon, it is still inadequate
in so far as it stresses only one of the several statistical
methods.


Suggesting a possible definition Bowley says that
Statistics may be called “the science of counting”.³ Analysing
this definition he observes that while dealing with large
numbers, such as a population census, counting is neither easy
nor within the reach of an individual. Great numbers, instead
of being counted, are estimated. Even estimation requires
the co-operation of a group of people, since the numbers with
which statistics is concerned are very great. But because
of varying degrees of intelligence and sense of accuracy
among a group of workers, and also because of the difficulty
of so clearly defining the object to be counted that every
worker shall understand the same thing by the same definition,
the estimates are not mathematically exact. Bowley then
concludes that ‘though all estimates of this nature are sometimes
included under the term Statistics, this definition at once


¹ Bowley, A. L., Elements of Statistics, page 4.
² Ibid., page 7.
³ Ibid., page 3.



is too wide, and also does not bring out the distinctive nature
of statistical method’.¹ Obviously, this definition suffers from
the dual defect of emphasizing the method of counting used in
arithmetic rather than that of estimation on which Statistics
so much relies, and of taking into account only the collection
of data, leaving the analysis of the collected data quite out of
consideration. Therefore, this definition is also far too
restricted, though it does not bind down the scope of statistics
to a particular field of enquiry.

Boddington denominates statistics as a ‘science of
estimates and probabilities’.² This is a narrow point of view.
Estimates and probabilities are only a part and not the whole
of statistics. King defines statistics thus: ‘The science of
statistics is the method of judging collective natural or social
phenomena from the results obtained by the analysis of an
enumeration or collection of estimates’.³ The author himself
regards it as possible that statistical problems such as would
fall outside the limits of this definition might be imagined but
maintains that it is sufficiently broad for practical purposes.

In order that the definition of the Science of Statistics
may suit the modern sense of statistics it should be so framed
as to include all that is rightly its and to exclude what is
extraneous to it. Enough has already been said of what
statistics and statistical methods are, yet a few observations
are necessary before arriving at a suitable definition.
Statistics is concerned with mass phenomena, with large
numbers descriptive of groups, with results of collective
action. Individual facts and figures may be of interest to an
individual: Statistics does not deal with them. The earnings
of employees in a business may vary from man to man; a
worker may earn Rs. 7 in a certain week, or an average of
Rs. 5-8-6 per week and feel jubilant over it. The employer
may also feel satisfied with what he gets for what he pays.


¹ Ibid., page 4.

² Boddington, A. L., Statistics and their Application to Commerce, p. 7.
³ King, W. I., Elements of Statistical Method, p. 23.



To the business as a whole, however, the result of a worker's
labour is a unit in the cost of production. Neither are these
figures statistics, nor does statistics as a science deal with
them. If, however, we compare the total earnings of the group
with other elements in the business, say with its turnover, we
arrive at a clearly defined relationship between them. This
relationship should hold good in normal circumstances. Here
we have a statistical study. Individual peculiarities count for
nothing in statistical study. It is the possession of the same
peculiarities by the whole or a majority of the constituents of
the group that is significant. The data that are collected, that
is estimated or counted, are influenced by a multitude of
causes. Statistics analyzes them. In studying the properties
of such aggregates it employs methods that are based on
particular characteristics of large numbers. For instance, a
characteristic of large numbers and averages derived from
them is that they enjoy great inertia: individual incomes may
change very fast, total or average income varies very little.
Through such statistical methods accuracy of statements is
examined, complicated data are analyzed and one estimate is
compared with another. All those estimates to which these
methods apply fall within the scope of statistics. Statistics is,
therefore, not confined to any particular branch of human
knowledge: it is all-pervading. Theory of statistics should
then comprise an exposition of statistical methods. A simple
definition of Statistics may run thus: The science of statistics
is a study of the methods applied in collecting, analyzing and
interpreting quantitative data, affected by multiple causation,
in any department of inquiry.
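
The inertia of large numbers mentioned in the paragraph above may be seen in a short Python sketch. The weekly earnings below are entirely hypothetical figures chosen for illustration, as are the variable names; the sketch only contrasts the change in each worker's earnings with the change in the average of the group.

```python
# Hypothetical weekly earnings (in rupees) of ten workers in two
# successive weeks; these figures are assumed, not taken from any record.
week_1 = [7, 5, 6, 8, 4, 9, 5, 6, 7, 3]
week_2 = [5, 7, 8, 6, 6, 7, 4, 8, 5, 5]


def mean(values):
    """Arithmetic average of a list of numbers."""
    return sum(values) / len(values)


# Absolute change in each worker's own earnings between the two weeks.
individual_changes = [abs(b - a) for a, b in zip(week_1, week_2)]

# Absolute change in the average earnings of the whole group.
average_change = abs(mean(week_2) - mean(week_1))

print("largest individual change:", max(individual_changes))
print("change in the average:", round(average_change, 2))
```

With these assumed figures nearly every worker's earnings move by two rupees, while the average of the group moves by only a small fraction of a rupee, since the individual rises and falls largely cancel one another.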

Functions of a Statistician. 

The functions of a statistician are then simple. He is concerned,
firstly, with the collection of statistical data, secondly,
with their analysis and finally with the interpretation of the
results of such analysis. Sometimes a sort of division of labour
may be noticed in that the statistician may be engaged only 
on the analysis of data without bothering himself about the
methods of collection or about the interpretations that may 
be put to his results. But such a division of duties may not 
always result in the best elucidation of a given problem. 

A statistician cannot work miracles. He is not an alchemist
expected to produce gold from any worthless material. He is
rather like a chemist capable of assaying the value the
material contains and of extracting nothing more than this
value. It would, then, be no use commending a statistician
because his results are precise nor condemning him because
they are not. If he is gifted with the competence his craft
demands, the value of his results shall follow solely from
the material he analyses. His job is only to produce what the
material contains, and no more. A necessary qualification of
a statistician is that he must be an impartial umpire, free from
fear or favour. His personal prejudice should not be allowed
to affect the conduct of his daily duties.

Main Divisions of Statistics. 

The domain of statistics can be generally classified into 
two main divisions: Statistical Method and Applied Statistics. 

Statistical Method is concerned with the formulation of the 
general rules and principles applicable in handling different 
branches of data, e.g. the methods of collection of data, classification,
tabulation, comparison by means of averages, diagrams
and coefficients, correlation, etc.

Applied Statistics deals with the application of these rules
and formulae to concrete subject-matter like wages, prices,
trade, population. Applied Statistics may consist of biometry,
psychometry, vital statistics, administrative, social and economic
statistics. The last three are of immense importance, and
we shall be concerned generally with them.



EXERCISES 

(1) Explain clearly the concepts of Statistics, Statistical 
methods and Statistical science. 

(2) What are the characteristics that Statistics (statistical
data) possess? Explain them with illustrations.

(3) ‘By statistics we mean quantitative data affected to a
marked extent by a multiplicity of causes’. Explain.

(4) Statistics are ‘aggregates of facts, “affected to a marked
extent by multiplicity of causes,” numerically expressed, enumerated,
or estimated according to reasonable standards of accuracy,
collected in a systematic manner for a predetermined purpose, and
placed in relation to each other.’ (Secrist). Elucidate.

(5) Comment on the following definitions of Statistics:

(a) By Theory of Statistics or, more briefly, statistics we
mean the exposition of statistical methods.

(b) Statistics is the branch of scientific method which deals
with the numerical aspects of aggregates of natural
phenomena.

(c) The theory of statistics comprises an analysis and
interpretation of systematic collection of numbers relating
to the enumeration of great classes.

(d) Statistics is that branch of science which deals with
the frequency of occurrence of different kinds of
things or with the frequency of occurrence of
different attributes of things.

(e) Statistics is the science of estimates and probabilities.

(f) Statistics is the science of counting.

(6) Explain the different methods adopted by the natural
sciences and the social sciences for the elucidation of their data. 

(7) Define Statistics and point out the main difficulties
that a statistician has to face as compared with a physicist or
chemist.

(8) Differentiate between statistics and mere mass of figures. 

(9) Mention the different kinds of statistical methods generally 
used in investigations. Are there any fields of inquiry where these 
methods cannot be used satisfactorily?

(B. Com. Agra, 1940). 

(10) ‘Statistical methods include all those devices of analysis
and synthesis by means of which statistics are scientifically
collected and used to explain or describe phenomena either in their
individual or related capacities.’ Elucidate the above statement.

(11) By statistical methods we mean methods specially adapted
to the elucidation of quantitative data affected by a multiplicity
of causes. Comment.

(12) What are the main functions of a Statistician? Also
point out the essential qualifications that one, to be called a
statistician, should possess.

(13) Illustrate with suitable examples the main divisions 
of statistics. 

(14) ‘Statistics is co-operative counting’ — Explain. 



CHAPTER III 


FUNCTIONS AND IMPORTANCE OF STATISTICS 

Functions of Statistics.

Human mind is unable to assimilate a mass of complicated
data at any one moment. One can hardly form an unquestionable
opinion regarding the comparative examination standards
of two universities if he were told the marks obtained by
every one of the two thousand students of each university.
But if these unwieldy complex data were simplified, reduced to
totals or averages or presented through diagrams they would
become readily intelligible. The science of statistics performs
these functions. It boils down complex data to simple
representative numbers easily adaptable to human mind. In a
word, statistics simplifies complexity.
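
How a mass of marks may be boiled down to a few representative numbers can be sketched in Python as follows. The marks are hypothetical and only ten are shown for each university in place of the two thousand imagined in the text; the class width of twenty marks and the function name class_table are likewise assumptions made for illustration.

```python
from collections import Counter
from statistics import mean

# Hypothetical marks (out of 100) for two universities; the text imagines
# two thousand students in each, ten are used here for brevity.
marks_a = [35, 42, 48, 51, 55, 58, 62, 67, 71, 80]
marks_b = [30, 38, 41, 45, 50, 52, 57, 60, 66, 74]


def class_table(marks, width=20):
    """Count the marks falling in classes 0-19, 20-39, and so on.

    Each class is keyed by its lower limit.
    """
    return Counter((m // width) * width for m in marks)


for name, marks in (("A", marks_a), ("B", marks_b)):
    # One average and one small frequency table replace the full list.
    print(name, mean(marks), sorted(class_table(marks).items()))
```

The two averages and the two short tables can be compared at a glance, which the full lists of marks could not; whether such a comparison is fair still depends on the examinations being comparable.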

Statistics enlarges individual experience. One may 
exercise his best ability and power of judgment to view the 
quantitative significance of a phenomenon. For instance, one 
may make a guess about India's national income at a particular
time. But such guesses are subject to vagueness, inaccuracy 
and personal prejudices, in the absence of adequate statistical 
data. And, when one proceeds to examine the accuracy of 
his statement he finds himself in the realm of statistical 
investigation. A statistical estimate is always better than the 
conjecture of a casual observer. 

Statistics compares the simplified data and measures their 
relationship. To appreciate the meaning of one estimate we 
often need another for comparison. A statement of water-rates 
charged in a certain town is meaningless if a similar statement 
for other equally important towns in the country is not forthcoming.
It is, therefore, the relative or comparative, not so
much the absolute, character of statistics that requires our
attention. But while making comparisons, due allowance must
be made for differences in the circumstances prevailing between
two periods, or two countries, as the case may be. For instance,
if the water-rates charged in a town whose source of water
supply is situated at a considerable distance be compared with
the water-rates charged in another town which has easy access
to the source of water-supply, the result is bound to be vitiated
if due allowance is not made for the differences in the conditions
of the two towns.

Importance of Statistics. 

Statistics has been termed the science of kings. Indeed,
in ancient times it kept the kings informed about the man-power
and riches of their domain. What are now called
statistical studies were in the past conducted under the name
of political arithmetic. Civilization has advanced since
then, and the application of the mathematical theory of
probability to social phenomena has yielded an ingenious
apparatus to deal with the figures of wealth and welfare. It
will not be inappropriate, then, to name statistics as the
arithmetic of human welfare to-day. Statistics is indispensable
these days for a clearer appreciation of any problem affecting
the welfare of mankind. Problems relating to poverty,
unemployment, food shortage, protective tariffs, uneconomic
agricultural holdings etc. cannot be fully weighed without the
statistical balance. These are the days of planning. Planning
without statistics cannot be imagined. Neither can a policy
be scientifically chalked out, nor can its success be measured
without the statistical apparatus.

Statistics is the light-bearer that enlightens the way to
life's adventures. It unravels the crowded complexities of
life and thought. Without its support man would wander
aimlessly through this perplexing universe. Statistics discloses
causal connection between related facts. Such study is at the
bottom of all sound human endeavour.

Statistics are the eyes of administration. No statesman 
can tender sound advice on a problem to his government unless 
he has adequate statistical data before him to base his 
judgment upon. Crime, drink-evil, tuberculosis and similar 
maladies need statistical investigation to suggest remedies for 
their cure. Budget, a collection of the estimates of revenue 
and expenditure of a state for the ensuing year, is an unavoidable
necessity for an efficient running of government
machinery. Its preparation is not possible without statistical 
records and without their utilization by a personnel having 
knowledge of statistical methods. Once the budget is ready, 
decisions regarding enhancement or decrease in the existing 
rates of taxation or regarding the exploration of new sources 
of revenue can then be taken up without much trouble. 

Statistics are an aid to supervision, particularly in these 
days of impersonal relationship between the employer and the 
employee. Every institution, commercial or otherwise, aims
at obtaining efficiency with economy. Old plans are substituted
by new ones. To test the effectiveness of new policies, the
officers must be provided with accurate and concisely tabulated
information showing the results obtained. 

Statistics are invaluable in business and commerce. To
be successful, a producer or a dealer should first estimate the
demand for his wares, analyze the possible effects of factors
like seasonal variations in demand, changes in taste, in
fashions and in purchasing power of money, and then proceed
to adjust his output or purchases to the estimates of demand.
For, if he does not arm himself with a cautious and fairly
accurate estimate, he would either be erring on the side of
over-stocking himself and thereby suffering loss, or
under-stocking himself and, therefore, losing chances of making
profit. Business, indeed, runs on estimates and probabilities.
The higher the degree of accuracy of a businessman's estimates,
the greater is the success attending on his business. Correct
estimation demands a high class of skill which only long 
experience can ensure. Statistics helps the recording of the 
past knowledge and experience, and drawing out standards
or ‘types’ with which results from year to year can be
compared, reasons for changes deduced, and effects of such 
changes on future studied. Experience so ascertained acts as 
an economic barometer. The businessman, forewarned of a 
currency inflation, boom or depression, shall prepare himself 
to face it boldly. Statistics is closely associated with economic 
progress. Statistics can be profitably employed in the different 
branches of commercial activity. It is being applied in cost 
accounting. The accounts of a concern undoubtedly show its 
financial position, but they alone cannot correctly indicate
business activity. Statistical averages or indices shall have 
to be computed for reliable conclusions. Statistics can help 
the launching of new projects and exploitation of potential 
markets. In a word, statistics is the life-blood of successful 
commerce. 

An underwriter, a stock-exchange broker or an investor in 
securities needs a knowledge of interest rates, the fluctuations 
of investment market and other data to strike a timely bargain.
A banker, intending to build up a pyramid of credit, should 
have an adequate knowledge of the seasonal variations in the 
calls for money on his bank to decide the amount of reserve 
that he should keep from time to time in his vaults. A railway 
operating over a wide area has numerous sources of possible 
wasteful expenditure. Bad working, such as using four
engines where three will do, or handling three tons where four
should be disposed of, is costly to the railway itself.
Similarly, inability to clearly gauge the necessity of running
special trains is not only to cause inconvenience to its patrons, 
but also to deny to the railway the revenue that could have 
been its own, if only conclusions had been drawn from past 
experience based on statistical records. The rod of statistics
is, therefore, indispensable for a railway company to keep its
working within the bounds of efficiency and economy. All
forms of insurance subsist on precise calculations based on
the analysis of a huge mass of data. The entire working of
life assurance schemes, for instance, rests on the compilation 
of life tables and computation of expectation of life from time 
to time. Unemployment or sickness insurance similarly 
depends on statistical data. Again, in order that a social or 
economic legislation may be fair and judicious it should be 
based on carefully recorded quantitative information. 
Enactments with regard to poor-relief, fixation of rate of
exchange, levying of excess profits tax or stoppage of child 
marriage need a proper statistical investigation. The merit of 
the recommendations of various committees and commissions 
largely rests upon the statistical information behind them and 
the correctness of the statistical methods used. 

Statistics is indispensable in a quantitative study. Its
methods can be usefully employed in any science. A
sociologist may attempt to demonstrate a relationship
between sales of liquor and crime, or between suicide
and poverty. A theoretical economist may make bricks
out of the straw of statistics. He may deduce important
economic principles from empirical data, or verify
the validity of deductive laws of economics by the inductive
method of statistics. If he wants to study the march of time
as revealed by the trend of population and the world's production
of food, the relation of changes in the value of currency
to prices, the relation of railway freights to internal and
external trade, the incidence of taxes, the influence of wages
on health and efficiency of labour, the effects of irrigation etc.,
he must take recourse to statistical investigation. A serious
danger in the absence of statistical information is to make
random arbitrary estimates to suit one's pre-conceived ideas
and pet notions. Statistics brings truth to light and corrects
faulty observation. An economist should equip himself with
a knowledge of statistical methods to guard himself against 
possible fallacies of argument. Statistics has much furthered 
the development of economics and if the data are considerably 
large and reliable, and correct statistical methods have been 
used in collecting and analyzing them, even forecasts can be 
successfully made. 

Statistical methods are extensively applicable. Astronomy
pioneered their use in predicting eclipses, and biology
has equally appreciably utilized them in its generalizations,
for instance, in laws of variations and heredity. Meteorology
uses them for weather forecasts. In these sciences and in physics,
geology, zoology etc., with whatever care and caution the
measurements may be taken, they cannot always be mathematically
exact. And so, an important problem to be attacked in them
is to compute the most probable estimate (an average or a
type) from a complex group of observations, about which all
the measurements are grouped in accordance with some definite
law. The next task is to watch the nature and direction of
the changes in the type or grouping of the measurements about
it. Upon such study are based several generalizations and
theories of these sciences, which, were they made arbitrarily
and without statistical basis, could not be fully relied upon
as true.

Statistical methods provide the only precise manner of 
measuring numerical changes in complex groups and judging 
collective phenomena. Statistical bureaus are being maintained
in almost all civilized countries of the world by the
governments, financial and commercial houses, railways and
other institutions. And the wholesome services which they 
are performing more than compensate and justify the cost of 
keeping them in being. 

Limitations of Statistics. 

With all its usefulness statistics has certain limitations
which should be carefully noted. Statistics deals with a series
of observed data in which individual items may considerably
differ from each other. In rendering them intelligible, it
computes averages, where these irregularities are brushed off.
In computing the averages we proceed on the assumption that
these differences are not significant, but this assumption,
though generally correct, may not always be true to facts.
For example, the total number of people engaged in hazardous
jobs in a country may be but a fraction of the entire population,
and the actual number of victims to hazards even in this
fraction may be very small, so that the general average may
not be appreciably affected. But this does not lessen the
torture of the victims, and affords no reason why they should
not be protected. A limitation of statistics, therefore, is that
it cannot take cognisance of individual items. Consequently,
where a study of individual constituents of a group is
important other means should be resorted to.

Since individual peculiarities of items are merged into
the average, the average should not be taken to imply more
than what it means. It merely indicates the central position
of the given data and does not tell the whole story. The fact
that the average percentage of marks obtained by two
candidates in three successive examinations is the same does
not bring out whether one is deteriorating and the other
making progress. Statistical analysis exhibits a characteristic
or trend of the given figures. Statistical results should not
always be treated as the sole determinants of the value of a
group. Yet they are as necessary for the study of a
phenomenon as accurate measurements are for the construction
of a building.
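
The case of the two candidates can be put into figures with a short Python sketch. The percentages are hypothetical, and the helper named trend (the change from the first examination to the last) is an assumed, deliberately crude measure introduced only to show what the average conceals.

```python
from statistics import mean

# Hypothetical percentages of marks in three successive examinations.
candidate_a = [60, 50, 40]  # deteriorating
candidate_b = [40, 50, 60]  # making progress


def trend(marks):
    """Change in marks from the first examination to the last."""
    return marks[-1] - marks[0]


# The two averages are identical ...
print(mean(candidate_a), mean(candidate_b))

# ... but the direction of movement is opposite.
print(trend(candidate_a), trend(candidate_b))
```

The averages agree, yet one series falls and the other rises, which is why an average should be read together with some account of how the items are arranged about it.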

Further, statistical laws are true on the average or in
the long run. They are not like the exact laws of physical
sciences which are said to hold true in every individual case
that is subject to them. Statistical laws, therefore, show
approximate tendencies, e.g., Pareto's law of income distribution.
Not only that, statistical data must also be statistically
uniform. That is, the data should belong to a causal system
that is highly stable so that there shall be no material fluctuations
in its main characteristics over the whole field of
observation. Without homogeneity of data comparisons would
be vitiated.

Statistical metbods are not applicable to the study of 
those facts that are not quantitatively measurable. Health, 
culture, character, friendship and skill are as good things to 
acquire as poverty, cruelty and pessimism are to eradicate, 
but there is no quantitative unit in which they can be 
expressed and compared. In such cases the statistical aspect
may be subsidiary to other considerations. A comparison of
the ‘state of civilization’ between two countries does not lend
itself to statistical treatment. Resort may be had to such 
numerical data as the number of persons passing a certain 
standard examination, the number of places of worship or 
entertainments, or the number of people convicted of crime. 
But these figures only indirectly relate to the real problem.
They are subsidiary to other information like the manner in 
which people in the two countries live, the value they attach 
to principles of right conduct, the treatment they mete out 
to others, the type of work they perform, the food they take 
and so on. Therefore, it follows that statistical methods are 
not of universal use or validity. Their use is confined to
quantitative studies. 

But the greatest limitation of statistics is that only one 
who has an expert knowledge of statistical methods can
scientifically handle statistical data, since statistics, like
medicines in the hands of quacks, are capable of being easily
misused by the inexpert. One might harness statistics to his
aid and make the worse appear to be the better case. Many
people, therefore, look at figures with an eye of suspicion.

Distrust of Statistics. 

There are said to be three degrees of comparison in
lying: lies, damned lies, and statistics. One may not believe 
in the truth of a statement. But, when he is presented with 
figures in its support he is led to believe: ‘if figures say so
it can’t be otherwise’. Such is the power statistics enjoy.
And if this power is misused, say figures are deliberately 
manipulated, one may be, for the time being, led to accept an 
utterly false statement as an absolutely accurate fact ; but 
truth will be out some time later and when it is out figures 
that were cited in support of the statement would be labelled 
as lies. Cases, where figures have been put forward as an 
evidence of the accuracy of a statement otherwise wrong, are
not wanting and since lies and damned lies can be detected 
by a lay mind much more easily than a misuse of statistics, 
statistics have suffered the stigma of being classed with lies. 

But reasons for disrepute cannot lie with statistics. By 
themselves they carry no weight. They can support false
conclusions just as easily as they support the true ones. With
them one may ‘prove’ that income per capita in India is high
while others may ‘prove’ it to be low. What are these
diametrically opposite conclusions due to? They are due
either to motive (design) or to ignorance. Thus it is these
reasons which lead to misuse or abuse of statistics, which in
their turn lead to disrepute. It is not statistics that are lies.
They are only tools in the hands of statisticians. If tools are
abused or misused it is not tools which are bad. The fault 
lies with the way in which the tools are used. 

The popular distrust against statistics is generally
expressed in the remark ‘statistics can prove anything’.
Those who say so are themselves at fault to a large extent.
As a matter of fact, little or nothing can be proved by statistics.
What can be done by them is to describe a phenomenon
quantitatively, classify it into parts, summarize the facts
relating to them and prepare the ground for a logical
inference. Very often too much faith is placed in figures alone
and it is believed that the inference to which statistics lead is
the only inference possible, that that inference is infallible 
and therefore need not be supplemented or verified by other 
than statistical evidence. This is what should not be. This 
has tended to bring the science of statistics and figurative 
data into discredit. Conclusions must be made in part on 
evidence other than that offered by statistics. 

If figures are quoted shorn of their context, if they are
applied to a phenomenon other than the one to which they
really relate, if figures relating to a part of a group are given as
relating to the whole, if figures favourable to an argument are
stated omitting the other side, or if they are inaccurately compiled,
deliberately manipulated or unscientifically interpreted, in all
these cases they can be made to produce a false statistical
argument. All these apprehensions make many a man look
at statistics with a jaundiced eye. 

Statistics suffer from the draw-back that they do not 
always bear on their face the mark of their quality. To a 
casual observer, a crude table compiled from unreliable data
looks as valuable as another prepared after great pains by a
number of trained statisticians. Most often, people who are
served with statistical information do not know whether a
particular factor can be statistically evaluated, or whether the
information is based on satisfactory data. If they are men
who believe that ‘figures won’t lie’ they shall accept them
without question, while others who are sceptical of their truth
shall treat them as ‘tissues of falsehood’. In fact, before
accepting or rejecting statistics, one should enquire into the
competence of their author. Another apparent draw-back of
statistics is that since they are expressed as definite, concrete
quantities they look innocent and precise, and people often
believe them to depict an accurate picture. But if they are
disillusioned they blame the statistics. It should be noted
that their appearance in quantitative form is not a guarantee 
of their accurately presenting the phenomenon to which they 
relate. They show only one method of doing so. 



Whatever be the distrust of statistics it does not imply 
that statistics have no value, or the science of statistics is 
useless. Drugs may be misused, but neither is their usefulness 
lost, nor does the science of medicine become valueless. No 
doubt statistics do not supply conclusions but they do furnish, 
in part, the basis on which conclusions may be drawn. Their 
usefulness is, therefore, great. It is imperative then, that 
statistical data should be handled only by those who are aware
of their use, limitations and dangers and are free from 
prejudice. If their limitations are forgotten fallacious con- 
clusions would result. If data are carefully collected and 
scientifically analyzed, the results obtained shall be trustworthy.
With the study of statistics as a science, with the
recognition of its limitations and with improvements in its
technique the cause for its distrust is gradually waning. A
layman is apprehensive of statistics largely because he does
not understand the technique which statisticians apply to a 
problem in which he is interested. Statisticians and education- 
ists are doing their bit to make up this deficiency. 


EXERCISES 

(1) Explain and illustrate the functions of statistics.

(2) Discuss fully the importance of statistics as an aid to
commerce.

(B. Com. Alld. 1942).

(3) Discuss the importance of statistics for social phenomena.
How far do you agree with the statement that planning without
statistics cannot be imagined?

(4) Write an Essay on:

Either, (a) Statistics in the service of the State, 
or, (b) Collection of economic statistics during a 
population census. 


(Madras, Dip. in Econ., 1931).



(5) ‘A knowledge of statistics is like a knowledge of foreign
language or of algebra: it may prove of use at any time under any
circumstances.’ Explain.

(6) Evaluate the importance of the study of statistics at
the present time in India.

(7) Correct statistical information is as essential for a
plan for the welfare of mankind as correct diagnosis is for the
successful treatment of a chronic disease. Explain this statement
with necessary comments.

(8) ‘The Statistics of a business can be treated scientifically
and the preparation and study of business statistics may be made 
a more exact science than the study of national and social statistics.’ 
Explain. 

(B. Com. Alld. 1932).

(9) Explain clearly the statistical methods used in any 
scientific investigation, and show their importance to theoretical 
economists and practical businessmen. 

(B. Com. Alld. 1933). 

(10) Give a lucid explanation of limitations of statistics. 

(11) ‘There are three degrees of comparison in lying.
There are lies, there are damned lies, and there are statistics.’
How far do you agree with this statement?

(12) How do you reconcile the following statements?

I. (i) Statistics can prove anything.

(ii) Statistics do not prove anything.

II. (i) Statistics are lies.

(ii) Figures do not lie.

(13) Discuss the scope, utility and limitations of statistics. 

(B. Com. Agra, 1937). 

(14) The claims of statistics to our support depend upon the 
efficient mental training it provides for the citizens, the light it 
brings to bear upon many important social problems, and increased 
comfort it adds to practical life — Discuss. 



(15) Indicate the usefulness of statistics to the state, legis- 
lators, bankers, businessmen and economists. 

(16) Give the important uses and limitations of statistics. 
Show its relation to Economics and Mathematics. 

(B. Com. Luck., 1938). 

(17) In what ways can statistical methods be misused by 
interested persons? Give at least two examples of the misuse of 
statistics. 

(B. Com. Luck., 1939). 

(18) Discuss the importance of the study of statistics, and 
show how it can help the extension of scientific knowledge, the 
establishment of a sound business, and the introduction of social 
and political reforms. 

(B. Com. Agra, 1942).

(19) ‘ Sciences without statistics bear no fruit. Statistics 
without science has no root ’ — Comment. 



CHAPTER IV 


STATISTICAL INQUIRIES AND UNITS 

In organising a statistical inquiry it is at first essential
to ascertain the object of the inquiry, since the type of inquiry
to be undertaken and its details will be largely determined by
the light which it is the purpose of the inquiry to throw.

Types of Statistical Inquiries.

An appreciation of the different types of statistical
inquiries is necessary, for the meaning, scope and accuracy of
statistical data and the method of collecting requisite information
are dependent upon the type of inquiry in hand.
Distinction between statistical inquiries can be made upon
answer to the question: ‘By whom is statistical information
required?’ It may be required by the state, a business house
or a scientific investigator. Their facilities for collection of
data differ. The state may legislate, an institution may
request while a private individual may beg for the purpose.
The sum of money that everyone of them can spend on the
inquiry is different, the individual’s financial capacity being the
weakest. Again, official, commercial and scientific inquiries
will look at the same subject-matter differently; facts material
to one class of investigation will not be relevant to another.

Another distinction between statistical inquiries can be 
made according to how the statistical information emerges.
Figurative data may emerge as by-products of certain 
administrative operation or they may be obtained directly by 
collecting information relating to certain affairs. In the first 
case, collection of data is not the primary purpose, but subsi- 
diary to or only a part of the main action. For instance,
imports into India are recorded at the customs office, and these 
records serve as the raw materials of statistical tables. In the 
second case, the collection of data is the sole end. For 
example, a census of population yields the figures which it is
the purpose of the census to collect. Obviously, the degree of 
accuracy attainable in the results of the first type of inquiry 
is not likely to be higher than that in the results of the second, 
that is an enquiry ad hoc. Again, in the first case, the defini- 
tions of the terms used shall be so designed as to suit 
administrative needs, while in the second they shall suit the 
purpose of the particular problem. For instance, the term
‘wage’ may mean ‘money wage’ to those administering a
sickness insurance scheme, and ‘real wage’ to wage-earners
claiming dearness allowance in times of rising prices.

Again, statistical inquiries may be distinguished according 
to the source from which the information is obtained. In some 
inquiries the number of persons playing an influential part
may be large, in others small. For administering a scheme of
food-rationing in a town, every householder may be held
responsible for the information relating to persons and
grain-consuming animals in his household. The number of
householders is no doubt very large. Therefore, the sources from
which information is obtained are varied. The questions
contained in the form will, therefore, have to be few, simple
and unambiguous because of the varying educational standards
of the people. The scope of the inquiry would naturally be
limited on this account. If, on the other hand, sources are
few compared to the size of the inquiry, these few may be 
skilled investigators appointed to collect requisite information. 
The scope of an inquiry can be extended here, because the 
investigators can elicit the information which, in the former 
case, may be difficult to extract. 

Another distinction between different kinds of statistical 
inquiries may be made by using the words Census and Sample.
In the census the whole group is surveyed, for instance, the
Census of Population or accounts of Foreign Trade of India. 
In the sample only a part of the group is surveyed, for example, 
a sample survey of acreage under jute in Bengal. 

Statistical inquiries may also be distinguished as direct
and indirect. Height of students in a class is measurable
directly in inches, but their intelligence cannot be quantitatively
ascertained. In cases where the desired information is
not capable of statistical treatment, some allied information
reducible to numerical standards will have to be collected. 
In this particular case reliance will have to be placed on the 
marks obtained at a certain examination or intelligence test. 
This inquiry is indirect. 

Statistical inquiries may be original or repetitive. They 
may either be carried on for the first time or in continuation 
or repetition of previous inquiries. In the former case a plan 
will be initiated. In the latter, the old plan, with such minor
alterations as experience or necessity demands, may be
followed. But the definition of units used should not be
materially altered in the repetitive inquiry. Advantages
resulting from modifying the old plan and of continuity and 
comparability of information must be weighed before effecting 
any alteration in the plan. 

Lastly, statistical inquiries may be undertaken for
absolutely confidential purposes or they may be thrown open
to the public. Trade associations may collect information from
their members which may be kept secret. Modes of treatment 
for both types of inquiries will not be identical. 

Units of Measurement.

Having ascertained the purpose for which statistical data 
are to be collected and used, and having formulated the type 
of which the inquiry will be, the next step in organizing 
statistical studies is to define, rigidly and unmistakably, the
units of measurement in which the aggregates to be counted 
shall be expressed. Quantitative science demands a precise
and unambiguous terminology, for this terminology, once 
specified, shall be adhered to throughout the inquiry. 
Adherence to the definition once made is essential in order 
that the thing counted or measured may be the same through- 
out the inquiry. Strict comparison shall be possible only 
when the things counted are the same. The task of defining 
the unit seems at first easy, but in many cases the opposite is 
true. Literacy connotes one meaning to an ordinary person, 
another to a sociologist, but for understanding the tables 
relating to it in the Indian Census Reports its meaning is some- 
thing precise — ability to write a letter and to read the answer 
to it. In studying the problem of ‘ educated unemployed ’ in 
India, the questions that at once arise are: what is exactly 
understood by ‘educated’, and who is ‘unemployed’?
Upon a little thought it will be found that it is not easy to
answer such simple questions. Similarly factors like wages,
profits, accidents, imports, investments are differently interpreted
by different people. Correct definition is always determined
by the purpose in mind. Different purposes will necessitate
different definitions of the same unit. But before collection of 
data begins a correct specification of the unit will have to be 
made. Following observations are useful for the purpose: — 

1. The unit must suit the purpose of the inquiry. 

2. Its definition must be unambiguous, simple and 
complete in itself. 

3. The unit must be definite, specified and ascertainable. 

4. The unit must be stable and standard. (In India, 
currency fluctuations have not been rare, and weights and
measures still vary from locality to locality. Hence the 
necessity of taking a stable and standard unit.) 

5. Homogeneity and uniformity must be ensured. A unit 
should not imply different characteristics at different times. 
If the data are heterogeneous, they may be broken up into 
small classes to secure uniformity, or the process of
standardization may be followed. For example, the data for 
the compilation of an average of the wages received by 
workers in a factory, where male and female adults and 
children are working side by side, are heterogeneous, women 
getting lower wages than the men, and children getting the
least. In order that the average wage may be a true representative,
the data may either be sub-divided into three
groups, viz., ‘wages for male adults’, ‘wages for female
adults’ and ‘wages for children’, or females and children
may be expressed in terms of equivalent men.
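The two remedies for heterogeneous wage data may be sketched as follows. All wage figures are hypothetical, and the conversion factors used to express women and children as ‘equivalent men’ are assumptions made purely for the illustration; the text does not prescribe any particular factors.

```python
from statistics import mean

# Hypothetical daily wages in rupees; the figures are illustrative only.
wages = {
    "male adults":   [1.00, 1.20, 1.10],
    "female adults": [0.70, 0.80],
    "children":      [0.40, 0.50, 0.30],
}

# Remedy 1: sub-divide into homogeneous groups and average each separately.
group_averages = {group: mean(values) for group, values in wages.items()}

# Remedy 2: standardize by expressing every worker in 'equivalent men'.
# These conversion factors are assumed for the sketch, not taken from the text.
equivalent_men = {"male adults": 1.0, "female adults": 0.75, "children": 0.5}

total_wages = sum(sum(values) for values in wages.values())
total_workers = sum(len(values) for values in wages.values())
total_equivalent_men = sum(len(wages[g]) * equivalent_men[g] for g in wages)

crude_average = total_wages / total_workers            # mixes unlike units
wage_per_equivalent_man = total_wages / total_equivalent_men

print(group_averages)
print(crude_average, wage_per_equivalent_man)
```

The crude average over all eight workers describes none of the three classes well, whereas the group averages and the wage per equivalent man each rest on a uniform unit.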

Simple and Composite Units, and Co-efficients. 

The units of measurement may be classified into : 

I. Units of Enumeration or Estimation, and

II. Units of Analysis and Interpretation.

The first are those in which measurements are made. They 
are concerned with collection of data. The second are those 
in which data are compared. They are concerned with 
comparison of data. 

The first are either simple or composite. A simple unit 
is one that denotes a combination of characteristics that occur 
together. It simply distinguishes classes. Its meaning is 
general. Examples of simple units are a ton, a passenger, an 
accident, a sale, a store, a house, a room. Such units are 
mutually exclusive. They are defined easily and fairly
precisely. The degree of error associated with them is, 
therefore, small. 

A composite unit is one that is formed by adding a limit- 
ing or qualifying word or phrase to a simple unit, with the 
i-esult that its scope becomes limited and the task of defining 
does not remain easy. Examples of composite units are ton- 
mile, passenger-mile, industrial-accident, credit-sale, chain- 
store, bond-house, sleeping-room. Since composite units
present greater difficulty of definition than the simple units
do, chances of error coming in increase. 

The second, units of analysis and interpretation, include
ratio, or what is called co-efficient. They are used for
comparison. To compare, things must be placed in relation to
each other. To do this co-efficients are the best to employ.

A co-efficient takes the form of a comparative statement.
Comparison may relate to time, to space and to conditions in
time or space. For example, wages may be expressed in
Rupees, but related to days or months. We, then, speak of
Rupees per day or per month. We may express production of
wheat in maunds but relate it to province, farm or acre. We,
then, speak of maunds per farm, or acre, or province. We
may, lastly, express deaths in a given area in numbers, but
relate them to the entire population. Then we speak of death
rate per thousand or so. A co-efficient, in effect, is a comparison
between the numerator and the denominator, both of
which should be related and homogeneous. If passenger-miles 
are divided by passenger-train-miles we shall obtain passengers 
per train. But if passenger-miles are divided by ton-miles a 
monster will result. 
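The co-efficients named above are all simple ratios, as the following sketch suggests. Every figure is hypothetical, and the function name `coefficient` is an assumed one introduced only for the illustration.

```python
def coefficient(numerator, denominator):
    """Ratio of two related quantities, e.g. maunds per acre."""
    if denominator == 0:
        raise ValueError("denominator must not be zero")
    return numerator / denominator

# Hypothetical figures for a given area and year.
deaths, population = 450, 30_000
death_rate_per_thousand = coefficient(deaths * 1000, population)

# Hypothetical wheat production related to acreage.
maunds_of_wheat, acres = 2_400, 200
maunds_per_acre = coefficient(maunds_of_wheat, acres)

# Passenger-miles divided by passenger-train-miles gives passengers per train.
passenger_miles, passenger_train_miles = 1_200_000, 8_000
passengers_per_train = coefficient(passenger_miles, passenger_train_miles)

print(death_rate_per_thousand, maunds_per_acre, passengers_per_train)
```

The arithmetic would divide passenger-miles by ton-miles just as readily; nothing in the calculation guards against an unrelated numerator and denominator, so that check rests with the statistician.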

Different units yield different information and they should 
be selected in the light of the purpose in view. After the 
units have been selected and defined, the next step in planning 
a statistical inquiry is to determine its scope. Every phase of 
questions should be carefully studied and details checked. No 
effort should be spared to minimise the chances of error. The 
aim should be to avoid the necessity of conducting a second
inquiry and thus save time, labour and money from being
wasted. Then a suitable method of collecting statistical data 
will have to be selected. 



EXERCISES 

(1) Explain with examples the different types of statistical 
inquiries, and indicate the bearing of each on the collection of 
data. 

(2) What do you understand by the 'Object' of enquiry? 
Is it necessary to determine it before planning a statistical 
inquiry ? Why ? 

(3) What are statistical units of measurement? Explain the 
necessity of determining them. 

(4) Differentiate between simple and composite units. Give 
illustrations of transforming units from simple to composite. 

(5) What difficulties are experienced in defining the follow- 
ing terms for collecting statistical data? 

Accident, Industrial accident. Room, Class-room, Hindu, 
Exports, Literacy, Book, Improved variety of crop, wage. 

(6) Differentiate with examples the units of measurement
from the units of comparison. 

(7) What is a coefficient? Illustrate with suitable examples. 

(8) What precautions should be observed in specifying a unit?

(9) What preliminary steps will you take in planning a 
Statistical Inquiry? 



CHAPTER V


COLLECTION OF STATISTICAL DATA 

Primary and Secondary Data. 

Statistical data are generally classified as primary and 
secondary. The former are those which form the raw material
of inquiry, while the latter are those which have gone through 
the statistical machine at least once. The former are original, 
i.e., those in which instances have been recorded as they 
occurred without having been grouped at all. The latter are
those that have been worked up to a certain extent, i.e., they 
have been collected, tabulated and presented in some suitable 
form for any purpose. They are generally expressed in totals, 
averages and percentages. 

Distinction between primary and secondary data is one 
of degree. Data which are secondary in the hands of one may 
become primary in those of another. For instance, statistics 
of foreign trade of India are secondary data to the general 
public while they are primary data in the hands of the 
statisticians of the Department of Commercial Intelligence and 
Statistics. The distinction between them lies in the fact that
when figures have been ‘worked over’ for a purpose, when
they have been examined for their accuracy and comparability
and have been grouped, averaged or reduced to percentages,
that is, when they have lost the individual characteristics
which they possessed when they were reported, they become
secondary data.
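The passage from primary to secondary data may be pictured in a short sketch. The consignment records are hypothetical and serve only to show individual entries being grouped into totals and reduced to percentages.

```python
from collections import Counter

# Hypothetical primary data: one record per import consignment, as reported.
primary = [
    ("cotton goods", 120), ("machinery", 300), ("cotton goods", 80),
    ("machinery", 200), ("oils", 100),
]

# Working the records over: group by commodity and total the values.
totals = Counter()
for commodity, value in primary:
    totals[commodity] += value

# Reduce the totals to percentages of the grand total.
grand_total = sum(totals.values())
percentages = {c: 100 * v / grand_total for c, v in totals.items()}

print(dict(totals))
print(percentages)
```

Once only the totals and percentages are published, the individual consignments can no longer be recovered from them; in that sense the figures have become secondary.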

On the basis of primary and secondary data the methods 
of collecting statistical material have been divided into 
Primary Method and Secondary Method, the former collecting
original data, while the latter collecting such data as have 
already been “worked over” to some extent.

Primary Method. 

Under this method the following ways of collecting the 
requisite data are generally used. 

1. Direct Personal Observation. This method yields very 
accurate results for it implies that the investigator must be 
present on the spot to make patient and careful personal 
observation regarding how people work and live. First-hand 
information collected thus must be reliable. But, since within
a reasonable amount of time an extensive field of inquiry
cannot be covered by this method, it is useful specially for
intensive studies. It is, therefore, utilized in localized 
inquiries. Besides covering only a narrow field, this method 
is open to the charge that the chances of personal prejudices 
of the investigator affecting, even unconsciously, his conclusions 
are great. 

If it is not practicable for the investigator to be on the 
spot or to devote the time the above method demands, an 
alternative for the investigator is to question and cross-examine 
a person who is directly in touch with the facts under investi- 
gation. Here, since the investigator counts on the goodwill 
of others, he will have to be courteous in his behaviour. 
Besides, the questions that he would ask must be few, very
simple, clearly worded, not inquisitorial and so far as possible
demanding an answer in ‘yes’ or ‘no’ or ‘a number’. This
alternative method is also used in intensive studies.

2. Indirect Oral Investigation, assisted by a standard list
of questions. When the information desired is complex and 
informants are indifferent to supply it if directly approached, 
or if the field to be covered is very extensive so that the first 
method cannot be successfully employed, the viva-voce indirect 
evidence of several third parties, preferably of those indirectly
in touch with the facts under inquiry, may be recorded.
Commissions and Enquiry Committees appointed by govern- 
ments generally find this method suited to their needs. But, 
certain precautions must be observed in order that reliance 
may be placed on the data collected. Firstly, the indirect 
evidence of one person should not be relied upon. Secondly, 
it should be clearly ascertained whether the informant really
possesses a knowledge of the full facts. Thirdly, it should be 
considered whether the person questioned is prejudiced in 
favour of or against a particular viewpoint or is motivated to 
colour the facts. Due allowance must be made for the 
optimism and pessimism of the informant. Lastly, if the 
informant happens to be an un-educated person or suffering 
from occasional fits of mental disequilibrium, it should be seen 
whether he would be in a position to give expression to his 
ideas adequately and precisely. 

3. Estimates from local sources or correspondents. This 
method does not imply a formal collection of data. Local
correspondents obtain the estimates in their own manner and
report them to an appointed authority. This method yields
only approximate results, but expeditiously, at a small cost 
and with ease. 

4. Investigation through schedules to be filled by the
Informants. This method differs from the preceding one in
that the questions asked of the informants are those in respect
of which they are supposed to have definite and precise information.
If the informants reply intelligently, this method is
good for extensive inquiries. It is inexpensive and fairly
expeditious. This plan is largely adopted by private individuals
and even the government. But a large number of
informants do not generally answer the schedules unless the
inquiry is in the informants' own interest, or the private
individual or institution responsible for the inquiry is able
to persuade them to answer, or the state exercises its legislative
powers for the purpose. And, schedules that are returned are
very often incomplete, ambiguous and full of errors, since the
average informant is indifferent in these matters. In order 
that correct answers may be had the schedule, or a letter of 
request attached to it, should state the purpose of the inquiry, 
the identity of the person or the institution responsible for the 
inquiry and an assurance to treat the information tendered as 
confidential, if so desired by the informant. The questions 
should be few, clearly phrased and above all simple, and 
should not be such as to arouse suspicion and prejudice in the 
informant. 

5. Investigation through schedules in charge of Enumera- 
tors. Under this plan the informants themselves are not to fill 
in the schedules. Instead, the trained enumerators put them 
questions and record their answers. Therefore, in this kind of 
inquiry the schedules can be much more exhaustive than in 
the previous one, and the scope of inquiry can also be enlarged. 
Correct interpretation of every question and the method of 
collecting information must be explained in detail to the 
enumerators, so that different enumerators may not give 
different weight or meaning to the same questions. Enumera- 
tors should be equipped with a sample schedule duly filled. 
This plan affords quite good results and is the best for many 
extensive investigations. It is generally adopted in large- 
scale governmental inquiries. Its cost is, no doubt, prohibitive 
to a private individual or institution. 

Choice of Enumerators.

Selection of enumerators should be done with great care 
since on them depends the quality of the investigation. 
Intelligence, diligence and integrity must be their attributes in 
order that vague replies of informants may be detected, 
corrected or eliminated and fictitious quantities may not be 
entered. The enumerators should also be courteous and tactful 
so that they may extract the requisite information without 
causing resentment or ill-will in the informants. They should 
be free from bias. If these precautions are taken in the
selection of enumerators, needless errors and confusion shall
be avoided. 

Choice of Questions. 

A word about schedules used is necessary. The schedules 
may be either what are called ‘Questionnaires’, the answers
to which are recorded on a separate piece of paper, or ‘Blank
forms’, in which space is provided in the form itself for filling
in the reply. Headings and titles drawn up in them should
be lucid and easily understandable, and the degree to which
accuracy of a numerical result is required should be indicated.
The size of the paper should not be unwieldy, and each word 
and phrase used should be carefully scrutinized for ambiguous 
or controversial interpretation. 

The questions which are asked should be:

(1) such as the informant shall be able to answer,

(2) few in number,

(3) simple and clear enough to be easily grasped,

(4) not inquisitorial and not causing resentment so far
as possible,

(5) requiring brief answer, say ‘yes’, ‘no’ or
‘a number’,

(6) corroboratory if possible, 

(7) capable of being answered without prejudice, 

(8) directly related to the point of information desired. 

Selection of Representative Data. 

When the inquiry in hand is very extensive, it will not be 
practicable to undertake a census type of inquiry where each 
individual item of the universe or ‘population’ shall be
questioned. The inquiry will have to be of the type of sample 
survey, and the sample will be representative of the whole 
field. For example, in a census inquiry we may establish the 
facts about the heights of 2,000 people by finding averages and 
other statistical indices: our problem shall, then, be limited to 



44 


statistics: theory and practice 


a characterization of the heights of these 2,000 people. But, 
in a sample survey we shall use the properties of a random 
sample of variables for drawing inferences about the larger 
population from which the sample is drawn. For example, our 
question will be: what approximate or probable inferences 
may be drawn regarding the stature of a whole race of people 
fro4n an analysis of the heights of a sample of 2,000 people 
drawn at random from the people belonging to that race? 
Thus, in a census inferences for the whole population are drawn 
from a study of the whole field, while in a sample survey 
infei-enees for the whole population are drawn from the study 
of a representative part. 

The methods of selecting a sample or representative data
are two: Deliberate or Purposive Selection and Random
Sampling or Chance Selection. In the former method the
investigators deliberately choose the particular units, since
they feel that the small mass that they select out of a huge
one is typical or representative of the whole. If economic
conditions of people living in a province are to be studied
according to this method, a few towns and villages may be
deliberately selected for intensive study on the principle that
they shall be typical or representative of the entire province.
But they may not always be typical, since the personal element
has a great chance of entering into the selection of the sample.
The investigator may select a sample which shall yield results
favourable to his point of view and the entire inquiry may be
vitiated. If the investigators be unbiassed, the results obtained
from an analysis of a deliberately selected sample may be
tolerably reliable, provided the basis of selection is otherwise
unquestionable.

Random sampling, on the other hand, is free from the
influence of the personal factor. It is, so to say, a lottery
method in which individual units are picked up from the whole
group not deliberately, but by some mechanical process, so that
every unit has equal probability of entering the sample. Here
it is blind chance alone that determines whether the one unit
or another is selected. This is an important condition of this
method. If from a list of villages of a given area, arranged
in alphabetical order, every 100th or 50th village is marked
out for intensive study it would give a hundredth or a fiftieth
sample of the whole area. Or, in urban areas, from a directory
of the owners of shops in a particular locality every 10th or
20th shop may be selected for intensive study. Random sample
surveys of this type, rural and urban, have been suggested
by the Bowley-Robertson Committee for India. It should, however,
be noted that once a random selection of villages and shops
has been made, on no account should any one of the villages
and shops be substituted by another.
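
The two ways of picking units described above may be sketched in a few lines of Python. This is only an illustration under stated assumptions: the list of 1,000 villages, the function names and the seed are hypothetical, and the marking out of every k-th unit from an ordered list is what is nowadays usually called systematic sampling, as distinct from a pure lottery draw.

```python
import random

def lottery_sample(units, size, seed=None):
    """Simple random sample: every unit has an equal chance of being drawn."""
    rng = random.Random(seed)
    return rng.sample(units, size)

def every_kth(units, k):
    """Mark out the k-th, 2k-th, 3k-th ... unit of an ordered list."""
    return units[k - 1::k]

# Hypothetical frame: 1,000 villages listed in alphabetical order.
villages = [f"Village {i:04d}" for i in range(1, 1001)]

hundredth_sample = every_kth(villages, 100)        # a "hundredth sample"
lottery = lottery_sample(villages, 10, seed=1943)  # ten villages drawn by lot

print(len(hundredth_sample), hundredth_sample[:2])
print(sorted(lottery))
```

Once either list is drawn it is kept as it stands; no village is exchanged for a more convenient one.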

The ‘Sample’ method has many advantages over the
‘Census’ method. It economises in time, labour and money and
permits a small band of skilled investigators to do the whole
job more efficiently and precisely than a large army of unwilling,
inefficient enumerators would do in a census. Random
sample survey is largely coming into use because of these
advantages and of its scientific character. It affords a sufficiently
accurate picture of a large group without resorting to
a complete enumeration of all the units of the group.

The method of random sampling is based on the
mathematical ‘Theory of Probability’. The theory implies
that if from a very large group of items, technically called the
‘population’, a moderately large number of items is chosen
at random, these items are almost sure, on the whole, to
possess the characteristics of the population. If each of two men
plucked 200 leaves of a particular tree, the average of the
lengths of the leaves plucked by each man would be almost
identical, even though the leaves varied considerably in size.
Further, if one were to obtain the average length of all the
leaves of the tree, it would not materially differ from the average
length of either group. Similarly, if a rupee coin is tossed
twelve times, the probability is that it will fall half the times
(i.e., six times) with its head or tail turned upwards. On
this principle gamblers (dice-throwers, card-players, etc.)
run risks continuously and with profit, and insurance
companies insure against death or other calamities. It
is this principle that is responsible for regularity in the
number of crimes and of suicides in a country for a given
period of time. This principle is christened the Law of
Statistical Regularity.
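
The example of the two men plucking leaves can be tried out by a small simulation. The sketch below is only illustrative: the tree of 50,000 leaves, the range of leaf lengths and the seed are all assumed figures, not data from any actual inquiry.

```python
import random
import statistics

rng = random.Random(7)

# Hypothetical tree: 50,000 leaves whose lengths (in cms.) vary considerably.
tree = [rng.uniform(2.0, 12.0) for _ in range(50_000)]

# Each man plucks 200 leaves at random.
first_man = rng.sample(tree, 200)
second_man = rng.sample(tree, 200)

mean_tree = statistics.fmean(tree)
mean_first = statistics.fmean(first_man)
mean_second = statistics.fmean(second_man)

print(f"whole tree : {mean_tree:.2f} cms.")
print(f"first man  : {mean_first:.2f} cms.")
print(f"second man : {mean_second:.2f} cms.")
```

The three averages should come out close to one another, though not exactly equal, which is all that the law claims.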

It should, however, be noted that any number of samples 
will not give exactly the same results as a study of the whole 
group would. As a matter of fact, the probability of error 
diminishes with an increase in the number of items included in 
the sample. That is, the larger the sample, the more reliable 
are its indications. Its reliability is proportionate to the 
square-root of the number of items included. 
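
One common way of reading the square-root rule is through the standard error of a sample average, which falls in inverse proportion to the square root of the number of items. The sketch below assumes that reading, assumes a spread (standard deviation) of 3.0 for the individual items, and assumes the sample is small compared with the population.

```python
import math

def standard_error(spread, n):
    """Standard error of a sample average for items with standard deviation `spread`."""
    return spread / math.sqrt(n)

spread = 3.0  # assumed standard deviation of the individual items

for n in (100, 400, 1600):
    print(n, round(standard_error(spread, n), 3))
```

On this reading, a sample four times as large is only twice as reliable, so that gains in accuracy become progressively more costly.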

The Law of Inertia of Large Numbers furnishes a corollary
to the law of statistical regularity. It implies that large
numbers are relatively more stable than small ones. If one
part of a large group varies in one direction, the probability
is that another equal part of the same group would vary in
the opposite direction so that the total change would be slight.
For example, the production of tea or wheat may vary from
locality to locality in a given year, but the total world production
of tea or wheat remains relatively stable for decades.
Deaths in different parts of a country in a given year may show
violent fluctuations, but all fluctuations will hardly be in the
same direction, so that the death-rate for the whole country will
remain almost constant through a number of years. Thus
large numbers and the averages deduced from them have
great inertia.
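
The inertia of large numbers may be illustrated by comparing the yearly fluctuation of a single locality with that of the total for many localities. The figures below are wholly hypothetical: 400 localities, 30 years, and an output that fluctuates independently by up to 30 per cent. around 100 units in each locality.

```python
import random
import statistics

rng = random.Random(11)
LOCALITIES, YEARS = 400, 30

# yields[y][l]: output of locality l in year y (assumed independent fluctuations).
yields = [[100 * rng.uniform(0.7, 1.3) for _ in range(LOCALITIES)]
          for _ in range(YEARS)]

def relative_spread(series):
    """Standard deviation of a series expressed as a fraction of its average."""
    return statistics.pstdev(series) / statistics.fmean(series)

one_locality = [year[0] for year in yields]
whole_country = [sum(year) for year in yields]

print(f"single locality: {relative_spread(one_locality):.1%}")
print(f"all localities : {relative_spread(whole_country):.1%}")
```

The stability of the total depends on the fluctuations not all lying in the same direction; if every locality suffered the same drought in the same year, the total would fluctuate as violently as its parts.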

But it does not mean that the property of inertia does
not allow for change with the passage of time. It only
signifies that if the numbers under consideration are of great
magnitude, the change is likely to be more regular than in
cases where small quantities are considered. Secular movements
resulting from long period tendencies in the background
of conditions are not precluded. The death-rate of a country
may be relatively stable from year to year, but for a long
period it may show a progressive decline.

Secondary Method. 

Data already collected by others may be in manuscript 
form, for example, original records of (a) business houses or 
of (b) government offices such as accounts of business firms
and public offices, patwari’s village-books etc., or (c) notes of
past investigators and chroniclers. They may be type-written
or printed matter. Printed documents include books, journals, 
reports, bulletins and official publications. They may be 
published or meant for private circulation only. They may 
be original or derivative. They may be by-products of ad- 
ministration or meant for public information. They may be 
official or non-official. 

The following different ways of compiling secondary data 
for statistical inquiries may be noted: — 

1. Utilizing published information. Such information
may be:

(a) official, i.e. published by Government Departments,
Royal Commissions etc.,

(b) semi-official, e.g. published by municipalities,
railways etc.,

(c) published by Technical and Trade journals,

(d) published by trade associations, Chambers of
Commerce etc.,

(e) published by Research Institutions such as universities,
Economic Enquiry Boards,

(f) published by individual research workers.

2. Utilization of Business Intelligence Service bulletins,
e.g. daily, weekly or monthly bulletins, market reports issued
by Stock Exchange or Produce Exchange and dealers of
repute and standing.

3. Utilization of unpublished data or manuscripts.

4. Utilizing information collected by other agencies or
for other purposes.


EXERCISES 

(1) How will you organize an economic survey of a small
Indian State comprising five towns and one thousand villages?

(M. Com. Alld., 1943). 

(2) How far do the results of statistical investigation depend
upon correct sampling? Compare the different methods used to
secure representative data.

(3) Explain in detail how you would proceed to organize a
‘Census of Wages.’ Draw up a blank form or forms to obtain
the information required.

(B. Com. Agra, 1937). 

(4) Compare the advantages and disadvantages of the 
‘Census’ method (or complete enumeration) and the ‘Sample’ 
method of collecting statistics. 

(B. Com. Cal., 1937). 

(5) ‘Although the personal observation method is the best 
it is not possible to adopt it in many cases ’ — Discuss. 

(6) What is a Questionnaire? How does it differ from a 
Blank Form? What precautions should be taken in drafting a 
questionnaire ? 

(7) Draft suitable questionnaires for enquiries regarding: 

(1) Educated unemployment in India.

(2) Sugar industry in U. P.

(3) Economic condition of agriculturists in India.

(4) Cost of living of a University staff.

(5) Budgets of students in a college.

(6) Industrial survey of U. P.

(8) How would you organize an investigation into the hand-loom
weaving industry of the U. P.? Prepare a questionnaire suitable
for the purpose.

(B. Com., Alld., 1942). 

(9) It is required to estimate the total consumption of food- 
grains in the U. P. for enforcing a scheme of food-rationing. 
What statistical data should be collected for the purpose, and how? 

(10) Show the necessity of the use of the method of random
sampling in any extensive investigation. How would you make
use of this method in carrying out an economic survey of the rural
areas of U. P.?

(B. Com., Alld., 1935).

(11) Statistical investigations carried out by the Govt. are
usually based either on complete enumeration of the universe of 
reference as, for instance, the population census, or on the study 
of ‘ typical ’ cases as, for instance, the proposals regarding the 
economic census. Explain why the method of random samples 
is to be preferred to either of these methods. 

(M.A., Alld., 1936). 

(12) Write brief, but lucid, notes on:

(a) Law of Statistical Regularity,

(b) Law of Inertia of Large Numbers,

(c) Primary and Secondary methods of collecting data.

(13) Why are methods of collecting statistical data termed
as Primary and Secondary? How will you follow the latter method
in an inquiry?

(14) What difficulties are experienced in collecting information
about the following and how can they be overcome?

Family Budgets, scatteredness and smallness of holdings,
acreage under wheat in U. P., Indebtedness, Labour Conditions,
Intelligence.

(15) Explain the merits and demerits of distributing the work 
of collection of statistical data among a group of investigators. 
What essential qualities should be looked for in an investigator? 



(16) Explain the whole process of organising an enquiry into 
the system of agricultural marketing in the U. P. 

(17) Differentiate between Primary and Secondary data.
Give suitable examples. 

(18) A cotton manufacturer in Bombay is anxious to find 
new markets for his goods in India and foreign countries. What 
statistical materials should he collect? What material would he be 
able to get from published documents? 

(B. Com., Alld., 1935). 

(19) A sugar manufacturer in the U. P. is anxious to find 
new markets for his sugar outside India. Describe the procedure 
he should follow to get all the necessary statistical information for 
the success of his mission.

(B. Com., Luck., 1938).

(20) What are the different methods employed in collection 
of data for statistical inquiries? In what types of inquiry should 
each of them be used? 

(21) If you are required to study labour conditions in an 
industrial town, explain what you, as an investigator, will do. 

(22) How will the nature of questions to be put to the in- 
formants differ with different methods of collecting statistical data? 

(23) What do you understand by ‘corroboration from independent
sources’? Explain its necessity with suitable examples.

(24) How would you make use of the method of random
sampling in an economic survey of urban and rural areas in the
U. P.?

(25) Will you employ the random sample method or the deliberate
selection method in conducting provincial inquiries relating
to the following problems?

(a) Acreage under food-crops.

(b) Brassware Industry’s survey.

(c) Output of Khandsari sugar.

(d) Carpet-weaving.



CHAPTER VI 


EDITING THE COLLECTED DATA 

Editing Primary Data. 

After the schedules have been returned by the informants
or enumerators, as the case may be, they should be edited, i.e.
scrutinized to detect errors, omissions and inconsistencies. If
possible, defective schedules may be sent back for correction,
or if the investigator has reasonable ground to do so, he may
himself make the required amendments. Undue tampering
with them is, however, dangerous. Only in cases of unmistakable
error should alterations or modifications be made; otherwise,
even with a will to be impartial, a wrong, fallacious
conclusion might result. Such schedules as are thoroughly
unsatisfactory must be rejected. For, a smaller number of
correct samples is better than a large number of incorrect ones.
In the former case, the error can be mathematically corrected
with approximate accuracy. It is not possible in the second.
If a majority of informants have misunderstood a question there
is a clear case for making a change and conducting a second
enquiry. The extent to which omissions may be allowed is
also of importance. If the returns unquestionably confirm a
certain fact and the samples are tolerably representative, the
omission of a number of returns does not matter. If, on the
other hand, evidence is conflicting, the omission of even one
return may be a serious matter. The degree to which lack of
accuracy or presence of errors or approximations are to be
tolerated in editing the data is of great significance.

Accuracy. 

Perfect accuracy in the data is rarely attainable. The wheat
crop in India, for instance, cannot be exactly measured. It
can be estimated to a reasonable degree of accuracy. A
weighman, however perfect his weighing balance, cannot weigh
wheat or any other commodity correct to within, say, 1/64th
of a seer. Similarly the distance for a four-mile cross-country
race cannot be measured without giving a probable error of a
few yards. Even in scientific measurements absolute accuracy
is unattainable, for heights of liquids in test-tubes may vary
by a thousandth part of an inch or angles may differ by a
hundredth part of a degree. Thus, absolute exactitude is not
possible; only a closer approach to it is possible. Reasons for it
are obvious: the observer and the observing instruments are
both sources of error; statistical data cannot be given hard
and fast definitions; the statistician, unlike a chemist, cannot
experiment, conditions being outside his control. Statistical
methods do not, therefore, aim at arithmetical precision. A
statistician would be satisfied with reasonably accurate figures
provided he can measure their reliability. To attempt to
obtain the greatest possible degree of accuracy is to waste
time. Statistics has, thus, to deal with estimates and probabilities
and not exact enumerations.

What is a reasonable degree of accuracy shall be determined
by the nature of the material and the purpose of its
measurement. In common use, only a certain conventional
accuracy is required. Precious metals or drugs are much more
minutely weighed than hay, husk or saw-dust. Height of a
room may be measured correct to an inch, but the difference
of half an inch in the length of a man’s nose will make him a
monster. A railway is satisfied if a parcel for booking is
weighed correct to a seer, but the post-office would weigh it
correct to a tola at least. We do not care to know the population
of India within 100, nor the revenue or expenditure of
the government within 1,000. It would be enough if we can
estimate to that degree of accuracy which is required for
practical purposes. Thus, it is relative and not absolute
accuracy that is desired in statistical data.





Statistical Errors. 

The word ‘error’ is used in a special sense in statistics.
It denotes the difference between the estimate of a quantity
and its true value. It differs from a mistake in that it refers
to a difference resulting from any source of inexactitude.

The chief sources of errors are three: First, errors of
origin, e.g., prejudiced information or inappropriate definition
of units; second, errors of inadequacy, e.g., inadequate
sample data or incomplete information; third, errors of manipulation,
e.g., unconscious error in measuring, weighing,
counting or approximation.

Measurement of Error. 

Errors may be measured as absolute or relative. An
investigator is concerned with relative error more than with
absolute one. Absolute error is the difference between the
true value and the estimate of a quantity, while relative error is
the ratio of the absolute error to the estimate. If the monthly
average expenditure of students in a hostel was in reality
Rs. 50, and we measured it as Rs. 49, the absolute error is
Rs. (50-49), i.e. one rupee, while the relative error is

$$\frac{\text{Rs. }(50-49)}{\text{Rs. }49} = \frac{1}{49} = .0204.$$

The relative error is sometimes
expressed as a percentage error. Thus, the relative error in
the above case is 2.04 per cent. The error is positive since
the true value exceeds the estimate. If, on the other hand,
we estimated the monthly average expenditure as Rs. 51, the
absolute error is Rs. (50-51), i.e. minus one rupee, and the
relative error is

$$\frac{\text{Rs. }(50-51)}{\text{Rs. }51} = \frac{-1}{51} = -1.96 \text{ per cent.}$$

The error is negative since the true value is less than the
estimate.





In algebraic notation, let $u$ represent the measurement of
a quantity whose true value is $u'$, and $e$ stand for the relative
error of the estimate; then

$$e = \frac{u' - u}{u},$$

and, if $ue$ stands for the absolute error,

$$ue = u' - u.$$

The error would be positive or negative according as $u'$ is
greater or smaller than $u$.
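
The two measures of error can be worked out mechanically. The short sketch below follows the definitions just given (true value minus estimate, divided by the estimate for the relative error) and re-works the hostel example; the function names are merely illustrative.

```python
def absolute_error(true_value, estimate):
    """Absolute error: true value minus estimate."""
    return true_value - estimate

def relative_error(true_value, estimate):
    """Relative error: absolute error as a ratio of the estimate."""
    return absolute_error(true_value, estimate) / estimate

true_value = 50  # actual monthly average expenditure, Rs.

for estimate in (49, 51):
    abs_err = absolute_error(true_value, estimate)
    rel_err = relative_error(true_value, estimate)
    print(f"estimate Rs. {estimate}: absolute error Rs. {abs_err:+d}, "
          f"relative error {rel_err:+.4f} ({rel_err:+.2%})")
```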

Biassed and Unbiassed Errors. 

Errors may also be classified as biassed and unbiassed.
Biassed errors result from a bias or prejudice on the part of 
the informant, enumerator or measuring instrument. These 
errors, therefore, lie in the same direction and are cumulative 
in character. That is, the greater the number of observations, 
the greater would be the absolute error. For example, if wall- 
paper were measured with a foot-scale half an inch short, 
greater the number of feet measured, greater would be the 
absolute error. Unbiassed errors are those which arise auto- 
matically, without any bias or prejudice. They are subject 
to the law of statistical regularity, so that excess in one 
direction is almost balanced by defect in the other. Unbiassed 
errors, therefore, are compensating. The larger the number
of items, the smaller will be the error relative to the total. If the foot-scale
in our example be correct, error in a measurement in one
direction shall be compensated by error in another measurement
in the opposite direction, so that the greater the number
of feet measured, the smaller will be the proportionate difference between
the actual length and the length measured by the scale.
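
The contrast between cumulative and compensating errors in the foot-scale example can be simulated. The sketch below is illustrative only: it assumes a constant bias of half an inch per foot for the defective scale and, for both scales, a chance error of reading lying anywhere between half an inch short and half an inch over.

```python
import random

rng = random.Random(3)

def total_error(n_feet, bias_per_foot, noise=0.5):
    """Sum of per-foot errors (inches): a fixed bias plus a chance error in [-noise, +noise]."""
    return sum(bias_per_foot + rng.uniform(-noise, noise) for _ in range(n_feet))

# results[n] = (error with the biassed scale, error with the correct scale)
results = {n: (total_error(n, 0.5), total_error(n, 0.0)) for n in (10, 100, 1000)}

for n, (biassed, unbiassed) in results.items():
    print(f"{n:>5} ft  biassed: {biassed:7.1f} in. ({biassed / n:+.3f} per ft)   "
          f"unbiassed: {unbiassed:6.1f} in. ({unbiassed / n:+.3f} per ft)")
```

The biassed error grows in step with the number of feet measured. The chance errors do not vanish altogether, but they largely cancel, so that the error per foot measured becomes smaller and smaller as the length increases.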

If some investigators carry on investigation into the
economic condition of agriculturists in a few places in, say,
the U.P., in India, pre-determined to prove that their income
is high they would, probably by examining only the well-to-do
and debt-free cultivators and taking the incomes of those
living near the grain markets, having their own pack animals
for transporting grain, and marketing their produce themselves,
produce a high average income for each locality. But,
if they were not prejudiced by a pre-determined conclusion,
that is, if the inquiry was impartial, the investigators are as
likely to make a low estimate in one locality as to make a high
one in another. In the former case, errors are biassed, all
being in the same direction and causing the average to go high.
In the latter case, errors are unbiassed, positive ones neutralizing
the negative ones and reducing the resulting errors to a
small figure. The following illustration would clear the point.


Average monthly         Fact       Biassed        Unbiassed
income in                          Estimate       Estimate

                        Rs.        Rs.            Rs.

Locality A              20         21             22

Locality B              16         17.5           15

Locality C              22         22.6           21

Locality D              18         18.9           18.8

Averages                19         20             19.2

Relative errors                    5%             1%

From the above table it is clear that the errors of the biassed
estimate are cumulative, while those of the unbiassed one are
compensating or counterbalancing. Another illustration of
biassed and unbiassed errors is provided by the age-returns in
the Indian Census. The fact that women generally under-estimate
their ages causes a biassed error in the average age
of the population, while the tendency on the part of people to
return their ages at the nearest round number causes an unbiassed
error and, on the whole, does not affect the average
age of the population materially. Biassed errors seriously
affect the accuracy of the total and should, therefore, be
avoided. To eliminate the effect of unbiassed errors a large
number of observations should be taken. So, if the errors
cancel each other even a considerable degree of inaccuracy may
be allowed while editing the data, but if they are cumulative
they shall have to be avoided.
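
The averages and relative errors in the illustration of the four localities can be checked by a short calculation. The figures below are taken from that table on two stated assumptions, since the print is indistinct at both places: the biassed estimate for Locality D is read as 18.9 and the unbiassed estimate for Locality B as 15, these being the values that agree with the printed averages of 20 and 19.2.

```python
fact      = {"A": 20, "B": 16,   "C": 22,   "D": 18}
biassed   = {"A": 21, "B": 17.5, "C": 22.6, "D": 18.9}  # D assumed (see lead-in)
unbiassed = {"A": 22, "B": 15,   "C": 21,   "D": 18.8}  # B assumed (see lead-in)

def average(column):
    """Arithmetic average of the figures for the four localities."""
    return sum(column.values()) / len(column)

def relative_error(true_value, estimate):
    """(true value - estimate) / estimate."""
    return (true_value - estimate) / estimate

true_average = average(fact)

for name, column in (("biassed", biassed), ("unbiassed", unbiassed)):
    estimate = average(column)
    print(f"{name:>9}: average Rs. {estimate:.1f}, "
          f"relative error {relative_error(true_average, estimate):+.1%}")
```

Every biassed estimate exceeds the fact, so the errors accumulate; the unbiassed estimates err on both sides, so that most of the error disappears in the average.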

Approximation.

Numerous digits confuse the mind. They may be expressed
in round numbers, even though exact figures may be
available. For example, the figure 264,571 may be expressed
as 264,570 or 264,600, or 265,000, but not as 264,500 or 264,000,
or the population of India for 1941 may be expressed as 389
millions. Here arises another type of error called Possible
error. If a quantity is rounded off to the nearest 100, the
upper limit of error is +50 and the lower -50. The possible
error is therefore expressed as ±50. That is, possible error
denotes the upper and the lower limits within which the actual
error lies.
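
The rounding of 264,571 and the limits of the possible error may be verified as follows. This is a minimal sketch for whole numbers; the function names are illustrative, and exact halves are taken upwards by assumption.

```python
def round_to(value, unit):
    """Round a whole number to the nearest multiple of `unit` (exact halves go up)."""
    return ((value + unit // 2) // unit) * unit

def possible_error(unit):
    """Greatest error, in either direction, when rounding to the nearest `unit`."""
    return unit / 2

figure = 264_571

for unit in (10, 100, 1000):
    rounded = round_to(figure, unit)
    actual = figure - rounded
    print(f"nearest {unit:>4}: {rounded:,}  actual error {actual:+d}, "
          f"possible error ±{possible_error(unit):g}")
```

In each case the actual error lies within the limits of the possible error, though its exact amount cannot be recovered from the rounded figure alone.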

In approximation all figures except the first digit beyond the
margin of accuracy should be left out. For instance, if the
length of a leaf is recorded as 3.97 centimeters, though with
the scale used it was possible to read correctly to the nearest
tenth of a centimeter, it is better to retain the final digit 7,
since 3.97 is a closer approximation to the real length than 4.0
cms., which would be the reading if the digit 7 is
dropped. Again, if the last correct figure is a cipher it must be so
entered. If a leaf measures very nearly three centimeters, it
should be expressed as 3.0 cms., and not simply as 3 cms., for
the latter figure might mean that the reading is accurate to
centimeters whereas the former indicates that the reading is
correct to millimeters, which really is the case in our example.
If the reading is correct to centimeters, any leaf between 2.5
and 3.5 cms. in length would be entered as 3 cms., while if the
reading is correct to millimeters the entry 3.0 cms. would mean
that the length is between 2.95 and 3.05 cms. Similarly, an
entry of 3.00 will mean that the reading is accurate to hundredths
of a centimeter, and that the length of the leaf is between
2.995 and 3.005 cms.

When certain digits of a number are to be dropped, all
fractions over half should be counted as whole numbers and all under
half discarded. Fractions equalling exactly one-half may be
allowed to remain or left out at discretion. The following
table giving approximations should be carefully studied.



Original Number.        Approximation correct to
                        one decimal place.

15.049                  15.0

15.050                  15.1

15.249                  15.2

15.257                  15.3

15.948                  15.9

15.951                  16.0
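
The table of approximations can be reproduced mechanically. The sketch below uses Python's `decimal` module and, since the table takes 15.050 to 15.1, assumes that exact halves are counted upwards. Python's built-in `round` is avoided here because it works on binary fractions and takes exact halves to the nearest even digit.

```python
from decimal import Decimal, ROUND_HALF_UP

def to_one_place(number_text):
    """Approximate a decimal number, given as text, correct to one decimal place."""
    return Decimal(number_text).quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)

originals = ["15.049", "15.050", "15.249", "15.257", "15.948", "15.951"]

for original in originals:
    print(original, "->", to_one_place(original))
```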

Editing Secondary Data. 



Secondary data should be used only with careful inquiry
and criticism. It is never safe to take them at their face
value without ascertaining their meaning and limitations.
Inquiries relating to the following should be made before they
are used:

(1) The organisation that supplies the data. 

(2) The reliability of the compiler, and his ability to 
procure correct figures. 

If upon these two inquiries it is found that the data are 
not mere guess work, but are sound, the following information 
should be had : 

(3) The scope and object of the inquiry conducted by 
the original compiler. 

(4) The definition of units in which they are expressed, 





(5) The sources of the compiler’s information, 

(6) The method of his collection, including instructions 
given to the enumerators. 

(7) The degree of accuracy desired and achieved. 

(8) The extent to which they refer to homogeneous 
conditions. 

(9) The suitability of their application to the given 
problem. 

Even when all the inquiries have been thoroughly made,
there may still be some shortcomings in the data. The
investigator using the secondary data will then have to
exercise his common sense and experience to use the data. If
the organisation or compiler’s standing is not well-known, the
investigator may study such a part of the compiler’s data as
is familiar to him, and detect if there were any bias or motive
to manipulate the figures. A business intelligence bulletin,
for instance, may be very often biassed. Data may be lacking
in consistency or homogeneity, or may not be suitable to the
inquiry in the investigator’s hand. At every step, then, the
investigator’s intelligence counts much. Indeed, as Dr. Bowley
remarks: ‘In collection and tabulation commonsense is the
chief requisite and experience the chief teacher’. Where
compulsion was exercised by the government the informants
may have given information reluctantly and, therefore, not
precisely. Questions relating to use of intoxicants, mental
infirmity, total earnings, profits etc. are inadequately answered,
since they arouse antagonism and suspicion among the
informants. The agency for collection may have been inefficient,
methods defective, accuracy wanting.

The data may not have been co-ordinated. They may
have been collected for administrative purposes where a high
degree of accuracy was not essential. A sifting inquiry is,
therefore, essential. If co-efficients were computed it should
be seen whether the numerators and denominators were related
to each other and were homogeneous. It should be
known whether the quantities were strictly measurable and
whether they were measured on the same basis. In editing
secondary data for the purpose of one’s inquiry, the
investigator must be cautious and careful at every step. The
best thing to do will be to see whether the details followed by
the other organisation or compiler tally, or tolerably agree,
with the details that would have been followed by the investigator
himself.


EXERCISES 

(1) What precautions should be taken in approximating large 
figures ? 

Approximate the following figures, expressing them in 
hundreds, so as to show the biassed and unbiassed errors in 
approximation : — 

485,899; 410,902; 415,500; 290,492; 865,432; 399,491; 
450,256; 462,800; 300,099; 295,591. 

(2) Give examples to show that 

(a) biassed errors are cumulative and unbiassed ones are
compensating.

(b) relative accuracy is more important to a statistician
than absolute accuracy.

(c) it is conventional accuracy of measurement that is 
generally looked for. 

(3) In what way does a statistical error differ from a mistake?
What classes of errors are there, and how may they be
measured?

(B. Com. Alld, 1943).

(4) What precautions should be taken in making use of published
statistics for further investigation?

(B. Com, Agra, 1939). 

(5) (a) Discuss the main sources of errors in statistics and
their effects.

(b) State the important methods of approximation and
their utility in statistics.

(B. Com., Agra, 1940). 

(6) ‘ Let us have quantity as well as quality in statistical data; 
but if there be a choice between them, the latter is more important 
and essential than the former ’ — Explain. 

Is the above a good maxim for one editing the collected data? 

(7) What do you understand by editing of data? 

What will you do if the data before you are

(i) incomplete but representative,

(ii) incomplete and unrepresentative,

(iii) certainly wrong,

(iv) probably wrong?

(8) What considerations would weigh with you, while editing
data, in regard to their accuracy, errors and approximations?

(9) ‘ It is never safe to take published statistics at their face 
value, without knowing their meaning and limitations.’ — Bowley. 

Elucidate the above statement and point out the general rules 
that you would lay down for making use of published data. 

(10) What are the drawbacks of Secondary data? What 
precautions will you take in using them? Why are they used in 
spite of their drawbacks? 

(11) Write a note on the necessity of editing primary and 
secondary data before analyzing them. 



CHAPTER VII


STATISTICAL MATERIAL IN INDIA 

The chief sources of secondary statistical data available
in India are the periodic reports and publications of (1)
Central & Provincial Governments, Indian States, Districts
and Municipal Boards, (2) committees and commissions
appointed by the government, (3) semi-government institutions
like the railways or the universities, (4) research agencies, (5)
trade associations and private organizations, (6) technical
periodicals. A list of some of the important official and non-official
publications is given in Appendix II. Non-official
publications do not necessarily deal with non-official statistics.
Rather, they mostly rely on official data. Non-official statistics
obtaining in the country are meagre. They consist of
statistical compilations of certain trade associations and other
institutions which, in many cases, are regarded as highly
confidential and, in several others, do not see the light of the
day. They also consist of a few village and marketing
surveys and other statistical enquiries organized by some
universities and other bodies in India in recent years, and by
a handful of individuals in the past. So many of them remain
unpublished. All these studies, no doubt, indicate the lines
along which much useful work can be done, but private
individuals and institutions do not generally possess such
facilities for collecting reliable information as the government
departments do, and the latter alone are usually able to meet
the cost of publication. Therefore, the major bulk of the
statistical material available in India comprises official
statistics, the collection of some of which (particularly of
those relating to prices, land-values and cultivation costs)
dates back to the early nineteenth century. Great care and
caution must, undoubtedly, be exercised in using non-official
statistics; but even the official statistics are not suitable for
scientific purposes without a searching enquiry, since they are
the by-products of the administrative machine. And, the conflict
between statistical needs and administrative purposes is very
well-known.

Short-comings of Official Statistics.

There are two glaring defects of Indian Official Statistics.
In the first place, the agency for the collection of primary data is
hardly trustworthy. For instance, collection of agricultural
statistics, prices and wages is done by the least qualified men
in the Revenue and Police Departments of the provincial
governments. In the second place, scientific methods are very
rarely applied to the analysis of primary data. Because of these two
defects, the accuracy of official statistics has been questioned
in this country.

Inadequacy of statistical data in India is proverbial. The Indian
Economic Enquiry Committee, 1925, examined the material
then available for estimating the economic condition of various
classes of people in India and concluded as follows:

For the purpose of determining in what respects the 
statistical data available are deficient from economic point of 
view, the subject may be considered under the following three 
main classes: 

(1) General statistics other than production comprising
    Finance, Population, Trade, Transport and Communications,
    Education, Vital Statistics and Migration.

(2) Statistics of Production, including Agriculture,
    Pasture and Dairy-Farming, Forest, Fisheries,
    Minerals, Large-Scale Industries, Cottage and
    Small-Scale Industries.

(3) Estimates of Income, Wealth, etc.: Income, Wealth,
    Cost of Living, Indebtedness, Wages and Prices.

The statistics falling under Class (1) are more or less
complete; those under (2) are satisfactory in some respects but
incomplete or totally wanting in others; while as regards
estimates of income, wealth, etc. under Class (3) no satisfactory
attempt has been made in British India to collect the necessary
material on a comprehensive scale.

The language of the Bowley-Robertson Committee with
regard to the nature of statistical data available in India
cannot be improved upon. They wrote in 1934: The statistics
of India have largely originated as a by-product of adminis-
trative activities, such as the collection of land revenue, or
from the need of information relating to emergencies, such as
famines. Only in the case of the population census and to
some extent of foreign trade has there been an organisation
whose primary duty is the collection of information. As a
result the statistics are unco-ordinated and issued in various
forms by separate departments. Though in some branches care-
ful work is being done and determined efforts made to improve
the accuracy and scope of information, in others they are un-
necessarily diffuse, gravely inexact, incomplete or misleading
while in important fields general information is almost com-
pletely absent. The only co-ordinated general publication is
the Statistical Abstract, which omits some important statistics
which must be searched for in other documents. The situation
cries out for overhaul under the control of a well-qualified
statistician.

Indeed, from the multifarious activities of the state in India
there springs a constant and copious stream of numerical data
at regular intervals. Inconsistency and incompleteness in existing
statistics are natural in a country where the task of collection
and presentation of official statistics devolves on the different
departments of the Government of India, provincial govern-
ments and Indian States with their own administrative needs
and personnel. Co-ordination of official statistics is not possible in
the absence of a co-ordinating authority in the country.
Most foreign countries possess Central Statistical Bureaux for
collecting and editing all statistical matter of public interest.
The establishment of such a bureau was recommended by the
Indian Economic Enquiry Committee and that of a Permanent
Economic Staff by the Bowley-Robertson Committee. Neither
of these recommendations has yet been fully carried out. The
present statistical organisation of the Government of India
consists of a Director-General of Commercial Intelligence and
Statistics at Calcutta, responsible for collection of some official
statistics, and an Economic Adviser to the Government of
India with the Statistical Research Branch under him in Delhi
to undertake interpretation of statistics.

Yet another short-coming of official statistics is that the
exact significance, scope and method of their compilation are not
widely known, so that they are not self-explanatory. Delay
in publication of even the inadequate and defective data simply
adds insult to injury. Figures become completely out of date
by the time they are published and thus much of their useful-
ness is lost. The Bowley-Robertson Committee have made
some valuable suggestions for improving the official statistics
of India and measuring the National Income of the country.
A summary of their recommendations for measuring National
Income is given in Appendix III. These recommendations
were considered for long by the Government of India. The
scheme of an economic census and the establishment of an
economic staff were regarded as too costly and abandoned for
the time being.

Examination of some official statistics.

It would now be well to examine in detail some of the
important official statistics with a view to studying their short-
comings and offering suggestions for remedying them.

Statistical Abstract of British India. In this publication all
important statistics, including among others, those concerning
finance, currency, banking, population, industry, communi-
cations, labour and insurance, relating to British India, and
where available, to Indian States are regularly published.
These data are based on the information furnished by different
departments of the Central and Provincial Governments and
Indian States. The Abstract, therefore, contains all those
errors from which the original compilations suffer. The short-
comings of this document are:

(1) Lack of adequate co-ordination, though some co-
ordination is being attempted.

(2) Lack of completeness and consistency in existing
figures.

(3) Delay in publication.

It is suggested that a suitable machinery for proper co-
ordination such as that proposed by the Bowley-Robertson
Committee should be set up. The committee recommended the
establishment of a permanent economic staff consisting of two
trained economists, one statistician and a Director of Statistics.
The Director’s duties include co-ordination of central and
provincial statistics. For the second short-coming, the Bowley-
Robertson Committee have made valuable suggestions, which
only need implementing. Information about some important
problems which so far is not available in the Abstract should
be included. Delay in publication is partly due to the inclusion
of less important items which hold up the publication. If
unimportant items are deleted, the publication would not
only become handy and less expensive, but would also be
published without unnecessary delay. It has been argued that
if the Abstract is divided into different parts and each part is
published as soon as information relating to that section is
available, the defect of delay would be greatly reduced. But in
view of the separate publications dealing with different sub-
jects, such as Statistical tables relating to Banks and to
Progress of Co-operative movement in India, already existing,
this argument may be met if the publication of these special
reports is speeded up. It may, however, be suggested that,
following the procedure adopted in the Year Book of the League of
Nations, estimates may be published in place of finally revised
figures where the latter involve undue delay, the fact being
mentioned in a footnote. Thus, delay in the publication of
the Abstract would be very much reduced, and its usefulness
increased.

Agricultural Statistics. These statistics deal largely with
the land utilized for agricultural purposes and the crops raised
on it. Since land revenue has been the one important source
of revenue in India, we possess valuable statistical records
relating to land, crops and yields from times as early as the
famous settlement of Todar Mall, the Revenue Minister of
Akbar. The British administrators of India collected agri-
cultural statistics at an early stage, particularly when they
introduced the Ryotwari system towards the close of the 18th
century. Provincial governments took up the compilation of
agricultural statistics in 1866. In 1885 the first crop forecast,
relating to wheat, was made, followed in later years by fore-
casts of cotton, oilseeds, rice, jute, groundnuts and sugarcane.

Agricultural statistics are published regularly by the
Department of Commercial Intelligence and Statistics in the
following annual publications:

(1) Agricultural Statistics of India, Vols. I & II.

(2) Estimates of Area & Yield of Principal Crops in India.

(3) Plantation (Tea, Coffee and Rubber) Statistics.

(4) Summary Tables of Agricultural Statistics.

In addition to the above, Crop Forecasts and Intermediate
Crop-Forecasts are periodically issued, while the ‘Report on the
Census of Livestock, Ploughs and Carts in India’ is a
quinquennial publication.

The difficult task of making primary estimates relating to
agricultural statistics in India falls on the officers of the
Provincial Revenue Departments, who have to carry it out
amidst their heavy administrative and revenue-collecting
duties. It needs no emphasis that they have neither the time
nor the necessary training for the work entrusted to them.
Consequently the reliability of Indian crop statistics is of a
doubtful character.

In areas where the ryotwari system or Temporary Settle-
ment prevails all villages have been carefully surveyed
and mapped. There, the village accountant, called ‘Karnam’
or Patwari, keeps field records. At the beginning of the
sowing period he prepares a statement showing areas
under different crops in his village, and submits it to the
revenue inspector. These figures are aggregated in Tehsils or
taluks, districts and provinces. Once or more during the
growing period, and finally at harvesting, the village
accountant estimates the yield of the crops as so many annas,
generally taking 12 to 16 annas as standard. The Tehsildar,
exercising his general knowledge of the condition of crops,
reports a single result for all the villages under his jurisdiction
to the District Officer. The latter modifies these figures in the
light of his knowledge or discretion and reports a single
number of annas, or an average, for the district. This
‘average’ is very vaguely defined and leads to suspicion.
Usually it is the ‘mode.’ Further, it has been found that in
many cases the local official does not visit the fields but deduces
the area from the quantity of seed the cultivator says he has
sown. This malady is further aggravated by the indifference
of the revenue officers towards checking the Patwari's figures
personally. With regard to the annawari estimate of the yield,
the estimate is vitiated in some cases by the failure of the
revenue officers to actually get crops of a small area from an
average field cut and compared with the standard yield.
Again, since remission of land revenue has to be granted in
temporarily settled areas where the seasonal condition falls
below a certain percentage of normal, it is not impossible that
the village accountant and subordinate officers may pitch their
annawari estimate too high when the seasonal condition is on
the border line for the grant of remissions. The village
accountant is said to possess a bias to report no change from
the previous year, to underestimate a good crop or to
exaggerate the fall in a bad crop. The margin of error in his
estimates is an unknown quantity. Efficient supervision, due
criticism and proper scrutiny of the Patwari’s estimate of
both area and yield are essential if reliable and useful data
are to be collected.

The Department of Agriculture in each province fixes the
normal yield per acre for the different crops in each district
on the results of crop-cutting experiments conducted for each
crop sown on plots of average quality. These experiments
are too few, and the plots that are singled out for the experi-
ments are selected according to ‘purposive selection’, in
which personal bias has a great chance of prejudicing the
choice. The normal yield, therefore, loses its representative
character. The annawari estimates of a particular year are
compared with such defective normal yields. The comparison
is vitiated. According to the Bowley-Robertson Committee, if the
direct method of estimating the yield in maunds per bigha or
bushels per acre is adopted, dependence on the standard
yield would be completely done away with and accuracy of
data would improve. But, since the direct method is said to
be impracticable, the yield should be stated to the nearest
anna for each village and a weighted arithmetic mean of these
statements should be obtained for a Tehsil or Taluk. This
mean would be fairly reliable. The condition factor for each
Tehsil or district could be expressed as a percentage of the
normal rather than as so many annas. It is further suggested
that for the computation of a reliable normal yield a large
number of plots should be selected on a ‘random sample’ basis.
It is worth noting that improvements in this direction are
taking place in the U. P. and a few other provinces.
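
The weighted mean of village statements and the percentage form of
the condition factor may be illustrated by a short sketch in Python.
It is an illustration only, resting on assumptions not found in the
text: the village figures are invented, the area under the crop is
taken as the weight (the text does not say which weights should be
used), and 16 annas is taken to stand for the normal crop, although
the standard is said to vary from 12 to 16 annas.

```python
def weighted_mean_annas(village_reports):
    """Weighted arithmetic mean of village condition estimates.

    village_reports: list of (area_in_acres, condition_in_annas).
    Using area as the weight is an assumption made for illustration.
    """
    total_area = sum(area for area, _ in village_reports)
    if total_area <= 0:
        raise ValueError("total area must be positive")
    return sum(area * annas for area, annas in village_reports) / total_area


def percent_of_normal(annas, normal_annas=16):
    """Express a condition estimate as a percentage of the normal crop.

    normal_annas=16 is an assumed standard, not one fixed by the text.
    """
    return 100.0 * annas / normal_annas


# Hypothetical statements from three villages of one Tehsil.
reports = [(400, 12), (250, 10), (350, 14)]

tehsil_annas = weighted_mean_annas(reports)       # 12.2 annas
tehsil_percent = percent_of_normal(tehsil_annas)  # 76.25 per cent of normal
print(tehsil_annas, tehsil_percent)
```

A village with a large area under the crop thus counts for more than
a small one, which a single figure reported at discretion does not
ensure.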

In permanently settled areas such as those in Bengal and
Bihar there are no village accountants and subordinate
Revenue officials, nor are the villages surveyed and mapped.
There, the Revenue Officers have to depend for estimates of
area and yield on the guesses of the village headman supplied
to the former through police officers. The Revenue officers
are required to check these figures from personal observation
during their tours. They do not always have the time to do it.
These guesses are mostly under-estimates. It is suggested that
the system of printed forms, in vogue for collecting figures
relating to area under jute, should be adopted for other crops
in these tracts.

It may be pointed out that statistics of area under different
crops are fairly trustworthy in temporarily settled areas, since
on their accuracy depends the collection of land-revenue. But
similar figures for permanently settled areas are far from
being satisfactory, since they are not required for revenue
purposes. Agricultural statistics available at present in India,
because of their short-comings, are quite insufficient to deter-
mine whether food is increasing in proportion to population.
It is not possible to deduce from them the quantity or value
of total agricultural produce. This presents a serious handi-
cap for economists and statesmen in tackling the food problem in the
country. Even the yield figures are not sufficiently reliable
and areas for minor or mixed crops are not separately known.

The Director General of Commercial Intelligence and
Statistics, who publishes crop-forecasts two or three times for
the different commodities mentioned above in their respective
seasons, bases them on the information primarily supplied by
the village accountant. If the Patwari's bias can be removed and
his work properly supervised, the accuracy of his data, and
with it that of crop forecasts, will improve. Then the useful-
ness of forecasts to the commercial community would certainly
increase a good deal. Besides, crop forecasts for other com-
modities like jowar, bajra and maize should also be made. In
order that reporting may improve, the Agricultural Depart-
ments should be given an increasing share in making general
estimates of yield.

A quinquennial census of livestock is taken in different
provinces. But the information relating to animal products
(milk, meat, eggs, hides, bones etc.) is very scanty. A detailed
knowledge is desirable. Further, the classification of cattle
should be amplified to furnish greater details.

Prices, Wages and Cost of Living. Market prices of staple
commodities for smaller towns and wages for some grades of
labourers are reported regularly in the Gazette of India and
Provincial Gazettes. The revenue officials are required to
report prices. Overburdened with their administrative duties
they hardly have sufficient time to collect price quotations.
Adequate attention is not paid to exact description of the
grade, and distinction between wholesale prices for small lots
and retail prices is not made clear. Sometimes prices of the
same commodity but of different qualities are quoted, and at
other times the same old prices are quoted even when actual
prices have changed. In order that correct information may be
had, dealers in staple commodities should be persuaded to send
regular quotations of the same commodity of the same quality.
These quotations may be verified by proper supervision.

Wholesale prices of certain agricultural and non-agri-
cultural articles for a few bigger towns of India are available
in a Monthly Statement issued by the Department of Commer-
cial Intelligence and Statistics. Prices for the past few
months are also shown along with current prices so that com-
parison can be easily made. But when, at times, continuity in
monthly quotations is broken, the benefit of comparability is
lost.

The General Index Number of All-India Wholesale Prices
published until recently had outlived its utility, since its base
year was as old as 1873, the list of commodities had not been
revised since 1889 and it was unweighted. Its publication has
been discontinued since August 1941. ‘Index Numbers of
weekly wholesale prices of certain articles in India’ with week
ending 19th August 1939 as base are being regularly published
now in the ‘Monthly Survey of Business Conditions in India’
issued by the office of the Economic Adviser, Government of
India. These index numbers are based on only 23 commodities
and are unweighted. Geometric mean is used in their computa-
tion. Against them stand the Calcutta and Bombay Wholesale
Price Index Numbers, which are based on a much larger number
of items and are weighted. Naturally, these Index Numbers,
all representing wholesale prices in India, register a huge
difference for the same month. The base of the Calcutta and
Bombay wholesale price index numbers is July 1914, which
had better be changed now. In addition to these index
numbers, wholesale price index numbers for Madras & Cawn-
pore are also available in the ‘Monthly Survey’.
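
Why an unweighted geometric index and a weighted index may differ
widely for the same month can be seen from a small sketch in Python.
The price relatives and weights below are invented, and the weighted
arithmetic mean is only assumed here as the form of a weighted index;
the text does not state the formula used for the Calcutta and Bombay
numbers.

```python
import math


def geometric_mean_index(relatives):
    """Unweighted geometric mean of price relatives (base = 100)."""
    if not relatives:
        raise ValueError("no price relatives supplied")
    return math.exp(sum(math.log(r) for r in relatives) / len(relatives))


def weighted_arithmetic_index(relatives, weights):
    """Weighted arithmetic mean of price relatives (base = 100).

    The choice of an arithmetic mean is an assumption for illustration.
    """
    if len(relatives) != len(weights):
        raise ValueError("one weight is needed for each relative")
    return sum(r * w for r, w in zip(relatives, weights)) / sum(weights)


# Hypothetical relatives for three commodities in the same month.
relatives = [100.0, 200.0, 50.0]
weights = [1, 3, 1]  # hypothetical: the second commodity matters most

unweighted = geometric_mean_index(relatives)             # about 100
weighted = weighted_arithmetic_index(relatives, weights)  # 150.0
print(round(unweighted, 2), weighted)
```

The same three price movements yield an index of about 100 on one
method and 150 on the other, so a large gap between two published
index numbers need not mean that either is wrongly computed.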

Wages Statistics can be classed into three categories: (1)
Factories and Mines, (2) other Urban Occupations and (3)
Rural Occupations. The task of collecting statistics for the
first two categories should be entrusted to Provincial Labour
Officers, who may be appointed where they do not exist.
Bombay’s example is commendable in this regard. Regarding
wages of urban occupations scanty attention has been paid
to those working outside the factories in towns, e.g., municipal
employees, artisans, porters, builders. The range of occupa-
tions included is not comprehensive for towns. Wages for
rural occupations are often quoted between wide limits and
the frequency of employment is not indicated. Classification
of urban and rural workers is inadequate. Therefore, an idea
of the general movement of rural wages is difficult to obtain. It
may be suggested that in each district a small number of
villages, where wages are paid in cash and separate occupa-
tions are few, should be selected. Wages paid for the different
occupations should be collected, care being taken that in each
successive record the wages are paid for the same work and are
strictly comparable. The unweighted average of the rates
for each occupation would afford a fair measurement of the
general movement of rural wages in a province. Wage rates
should not be quoted as varying between two limits, and fre-
quency of employment should, so far as possible, be ascertained.
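
The procedure suggested for rural wages may be sketched in Python as
follows. The occupations, the rates (taken here to be annas per day)
and the two periods are all hypothetical; the sketch merely shows an
unweighted average of single quoted rates for each occupation, and
its movement from one record to the next.

```python
from statistics import mean


def occupation_averages(records):
    """Unweighted average rate per occupation.

    records: dict mapping occupation -> list of single cash rates,
    one from each selected village (no ranges between two limits).
    """
    return {occ: mean(rates) for occ, rates in records.items()}


def wage_movement(base_records, current_records):
    """Current average as a percentage of the base average, by occupation."""
    base = occupation_averages(base_records)
    current = occupation_averages(current_records)
    return {occ: 100.0 * current[occ] / base[occ]
            for occ in base if occ in current}


# Hypothetical rates from the same selected villages in two records.
base_year = {"ploughman": [8, 10, 12], "carpenter": [16, 20]}
this_year = {"ploughman": [9, 11, 13], "carpenter": [18, 18]}

movement = wage_movement(base_year, this_year)
print(movement)  # ploughman about 110, carpenter about 100
```

Because the same villages and the same kinds of work are used in each
record, the two averages are comparable, which is the condition the
suggestion insists upon.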

Cost of Living Index numbers are now available for 27
different towns of India. They are published in the Monthly
Survey of Business Conditions in India but not in the Statistical
Abstract. This omission should be rectified. Besides, cost of
living index numbers should be computed for other labour
areas where wage-payments are made on a cash basis, so that
public opinion may be kept well informed, particularly when a
wage dispute turns on the expense of living. Further, a
separate index number for salaried persons, say, under Rs. 100
a month, is worth attempting.

Trade Statistics. Statistics of foreign trade are available
in the monthly and annual Accounts and Statement of the
‘Sea Borne Trade and Navigation of British India’ together
with the ‘Annual Review of the Trade of India’, published by
the Department of Commercial Intelligence and Statistics.
They generally give the information that is practicable re-
garding the exports and imports of Foreign Merchandise,
exports of Indian merchandise, and total Exports under five
main classes. These classes are: (1) Food, Drink and Tobacco,
(2) Raw materials and produce and articles mainly unmanu-
factured, (3) Articles wholly or mainly manufactured, (4)
Living animals, (5) Postal Articles. But since imports and
exports on Government Account are not given in as much
detail as those on Private Account, it is not possible to arrive
at the total trade in particular commodities. This short-
coming should be made up.

Statistics relating to inland trade are now available in
‘Accounts Relating to the Inland (Rail and River borne)
Trade of India’ issued by the Department of Commercial
Intelligence and Statistics. Similar statistics were published
by the Department of Statistics up to 1922. The present series
retains in essentials the form of the older publication and is a
monthly production. The trade dealt with in the publication
falls into one or other of the following categories:

(i) Trade of a province with other provinces,

(ii) Trade of a chief port or other ports with the
province in which such port or ports are situated,
and

(iii) The trade of a chief port or other ports with other
provinces and ports.

In addition to the above publication, ‘Accounts relating to
the Coasting Trade and Navigation of British India’ and ‘Trade
Statistics relating to the Maritime States in Kathiawar and the
State of Travancore’ are also published monthly by the
Commercial Intelligence & Statistics Department.

Thus a good account of India’s trade statistics is available. 

The Census Reports. Census of population is generally
taken every ten years in India. Some time before the date fixed
for the Census a Census Commissioner is appointed, who selects
Superintendents of Census for each province and native state.
The Superintendents select honorary Census Supervisors
and Enumerators for each district or locality. Enumeration is
done through the municipality in a town and through the
Tehsildar in a rural area. Census work in India is unpaid. To take
the preliminary census the enumerator visits every house in
his block some time before the census day, and collects the
required information from the head of each family. Then,
on the night fixed for the census there is a simultaneous
country-wide count. The collected data are scrutinized,
classified and tabulated to produce the census reports. These
reports give valuable information relating to the distribution
and density of population, vital statistics, urban and rural
population, age and sex distribution, civil condition, infirmities,
occupation, literacy, languages, etc.

The last census was taken on 1st March 1941. It puts the
estimate of population at millions, of which is rural
and 13% urban. The one-night enumeration which was
adopted up to the census of 1931 was in 1941 replaced by a
period system. It enabled the number of enumerators to be
halved as against the number employed in 1931. Schedules
used formerly were abandoned. Instead, enumeration was
carried out directly on the slips, which were later sorted to
produce tables. A new feature was the taking of a 1/50 random
sample of the entire population which would be used for
making several deductions. Religion as a criterion of census
differentiation had several drawbacks and was replaced by
the concept of community. And some new questions, such as
the age of mothers at the birth of their first child and the
number of children born, were introduced. Returns for these
questions would help the computation of the net reproduction rate
for the country.
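
How a one-in-fifty sample of census slips might be drawn, and how a
total may be deduced from it, can be sketched in Python. The text does
not describe how the 1941 sample was actually selected; the sketch
assumes a systematic selection of every fiftieth slip from a random
starting point, which is only one possible method. The slips and the
attribute recorded on them are invented.

```python
import random


def one_in_n_sample(slips, interval=50, seed=None):
    """Take every `interval`-th slip, starting at a random position.

    Systematic selection is an assumed method, not the census procedure.
    """
    if interval <= 0:
        raise ValueError("interval must be positive")
    rng = random.Random(seed)
    start = rng.randrange(interval)
    return slips[start::interval]


def estimate_total(sample_count, interval=50):
    """Raise a count found in the sample to an estimate for all slips."""
    return sample_count * interval


# Hypothetical slips: every seventh person is marked as literate.
slips = [{"id": i, "literate": i % 7 == 0} for i in range(10_000)]

sample = one_in_n_sample(slips, seed=1)
literate_in_sample = sum(1 for slip in sample if slip["literate"])
estimated_literate = estimate_total(literate_in_sample)
print(len(sample), estimated_literate)
```

A systematic selection can mislead when the slips are arranged in some
regular order that keeps step with the interval; drawing the slips by
`random.sample` instead would avoid that risk at the cost of a less
convenient sorting procedure.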

The short-comings of the census reports are many. There
is a marked lack of uniformity in the classification of occupa-
tions from census to census. A study of the occupational structure
of the country is, therefore, rendered difficult. Further, the
distribution of industrial workers into employees and those
working on their own account is not available. Nor are the
parts of the year for which a worker is employed and the parts
for which he is unemployed known. Indian age-returns are
admittedly inaccurate mainly because of the ignorance of
precise age. Besides ignorance, there are some psychological
reasons too. The age-period of girls from 10 years to 15 years
is defective in numbers because of the unwillingness of some
people to admit having unmarried daughters who, according
to custom in the community or religious injunction, should
have been married by then. Widowers and bachelors,
particularly if they have a wish to remarry, tend to
understate their ages. Recently married girls
and particularly those who have become mothers tend to over-
state their ages. The old people have an inclination to
overstate their ages. Again, there is a marked preference for
stating the age at a number ending in zero or five. Enu-
merators are instructed to correct ridiculous returns of age.
If they are conscientious, they do it by asking the person
concerned questions about his age at the time when some well-
known event in the past occurred. But with the limited
ability of the Enumerators it is highly doubtful that they are
in all cases competent to detect the wrong and correct it.
Indian custom permits only a female investigator to verify
the information about pardanashin ladies. Such investigators
should be appointed.
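
The preference for ages ending in zero or five can be measured in a
simple way, sketched below in Python. The ages are invented, and the
measure used (the share of returns whose last digit is 0 or 5) is a
device adopted here for illustration, not one prescribed in the text.
If ages were returned accurately and spread evenly over the final
digits, that share would be roughly one-fifth.

```python
from collections import Counter


def terminal_digit_shares(ages):
    """Share of returned ages ending in each digit 0 to 9."""
    if not ages:
        raise ValueError("no ages supplied")
    counts = Counter(age % 10 for age in ages)
    return {digit: counts[digit] / len(ages) for digit in range(10)}


def share_ending_in_0_or_5(ages):
    """Share of returns ending in zero or five (about 0.2 if no heaping)."""
    shares = terminal_digit_shares(ages)
    return shares[0] + shares[5]


# Hypothetical age returns from one enumerator's block.
returned_ages = [20, 25, 30, 30, 35, 40, 41, 45, 50, 52]

heaping = share_ending_in_0_or_5(returned_ages)
print(round(heaping, 2))  # 0.8, far above one-fifth
```

A share much above one-fifth in a block would suggest that the ages
were guessed in round numbers and need the enumerator's closer
enquiry.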

Returns for civil condition in 1931 exhibited an excess of
married males over married females, whereas in the previous
censuses the ratio was the reverse. The reason for this was the
promulgation of the Child Marriage Restraint Act in 1930,
because of which those people who had married their male
children under 18 years and female children under 14 years
may have hesitated to disclose the truth for fear of prosecution.
Infirmity is very much concealed, particularly among females.
Deafness of children is, in many cases, kept a secret. Insanity
and blindness are generally matters of personal temperament.
Leprosy is infectious at an early stage when it cannot be
detected by the lay eye of the enumerator.

Accuracy of census returns depends to a large extent on
the capability of the enumerator and also on the general
circumstances prevailing in the country at the time of the
census. The census enumerator constitutes the front-line
force, and if he is not given adequate training he cannot be
expected to hit the mark with precision in all cases. It is no
wonder if the blank forms used in 1941 could not be filled in
correctly by many an enumerator, since the forms were not
simple and the enumerators were not given adequate training.
Training of the enumerator is, no doubt, necessary; but, along
with it, supervisors too must be picked from among
those possessing a knowledge of statistical methods and ways
of conducting demographic enquiries.




Circumstances obtaining in the country at the time of
census have a great part to play. The effect of the Sarda Act has
already been pointed out. Further, when the distribution of
seats in the legislatures, local bodies and government services
is based on the relative strength of the persons belonging to
different religions, it is not improbable that some people
exaggerate the number of members in their family, something
which an inefficient enumerator cannot always detect. This
factor is believed by some people to have caused an over-
estimation of the population in 1941. Again, if secrecy of
census returns is not ensured and people have the apprehension
that the returns may be produced in a court of law as evidence
of age or of a marriage against the provisions of the Sarda Act,
wrong returns would naturally result. Secrecy should be
ensured. The Census Act should be made permanent, and a
permanent staff should be appointed at the centre to work up
the material collected at each census. The present system of
setting up the whole census machinery hastily before the
census and disbanding it after the publication is complete,
whereby the experience gained at one census is not fully
utilized at the next, should go. Essential information for
each district should be made available in booklets. Tabulation
should be done by machines.

The above suggestions are not meant to be exhaustive,
but they do indicate the lines along which improvements can
be effected in the Indian Census. Improvements are
necessary because the census has great potentialities. Of
course, the census is primarily meant for administrative
purposes, but it also affords much valuable information for
the economist, the sociologist and the businessman. The eco-
nomist, for instance, can study, on the basis of census figures,
the population trend of the country, her occupational structure
and the increment in urban population. Utilizing other
relevant data along with these studies he can trace the
correlation between population-growth and food-supply,
between occupational changes and the effect of granting pro-
tection to industries, and between increment in urban popula-
tion and decay of rural crafts. The sociologist may study
the possibilities of effecting reforms in respect of, say, ages
at which people should marry, or arrangements that should be
made to bring down infantile mortality.

Businessmen do not always realize that the census reports 
contain information which is of an important nature for them. 
If they do, many of the piH)blems they are confronted with 
can be properly attended to. India has a very large volume 
of internal trade, and every human being returned in the 
census is a consumer. To the businessman a knowledge of his 
consumers and their location is evidently of immense aid. 
Again, with a knowledge of the density of population of 
different areas an estimate of areas where development of 
market is likely can be made. The higher the density of a 
certain locality, the greater is the market there. The selling cost 
will always be low since delivery service in a dense population 
is cheap. Further, knowing the number of inhabitants in a 
town and the quantity of goods the businessman had been 
usually selling there, it would be possible for him to compute 
the per capita consumption. And, if this per capita consump- 
tion shows a fall for no valid reasons other than lack of 
efficient push, the businessman can launch an intensive selling 
campaign to increase his sales. The class for which his goods 
are specially meant, say ladies, infants, military people, can 
be approached in a business-like manner. Besides, occupational 
statistics would tell the businessman whether a certain area 
is inhabited by the poor, whose purchasing power is low. He 
should then see if his goods are meant for such a class. If not, 
he would be wasting money over trying to gain a foothold in 
that particular area. He would also be able to gauge the pre- 
sent supply of labour and the future labour supply on the basis 
of occupational statistics and make adjustments accordingly. 
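
The per capita computation described above is simply the quantity sold in a town divided by its census population, compared from one date to another. A minimal sketch in Python follows; the figures and the function name are hypothetical and serve only to illustrate the arithmetic.

```python
def per_capita_consumption(quantity_sold, population):
    """Quantity sold in a town divided by the number of its inhabitants."""
    if population <= 0:
        raise ValueError("population must be positive")
    return quantity_sold / population

# Hypothetical figures for one town at two dates (not taken from any census).
earlier = per_capita_consumption(quantity_sold=12_000, population=40_000)
later = per_capita_consumption(quantity_sold=12_600, population=45_000)

# Total sales rose, yet consumption per head fell: a possible signal
# that an intensive selling campaign is called for.
consumption_fell = later < earlier
print(round(earlier, 3), round(later, 3), consumption_fell)
```

The comparison is made on the ratio and not on total sales, because a growing population can conceal a falling consumption per head.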

A transport agency, say a railway, would find valuable 
information in the census. The area which is densely popu- 
lated, or would be so populated, if only the means of transport 
are improved or introduced, should receive the first attention 
of the transport authority. Such areas would also be good 
for advertising agencies to push their advertisement among 
the class of people to which their goods relate. Producers of 
staple commodities and manufacturers of industrial goods can 
equally benefit from the material collected at the time of 
census. If population of a town falls, demand in that area 
will fall unless the demand of the existing population rises 
proportionately. Changes in the sex-ratio, in occupational 
structure, or in age composition are likely to affect demand 
for goods. Similarly, life insurance companies may compare 
their estimates of expectation of life with those published in 
the census reports and see that their premium rates are pro- 
perly drawn up. The legislators shall be able to study the 
necessity of framing legislative provisions for removing the 
ills from which society suffers upon a study of infant mortality, 
fertility rate, sex-ratio, infirmities. Numerous other uses of 
census figures can be thought of. All that has been said so far 
indicates the immense value of the census to different people. 
It is imperative, therefore, to make the census up-to-date and 
as useful as possible. 

Vital Statistics. The vital statistics of India are admitted- 
ly defective. Figures published in the statistical abstract are 
definitely misleading. The system of registration of births 
and deaths varies in different provinces. Generally they are 
kept up by the reports of village officials in rural tracts and 
by municipalities in the urban areas. The reporting of births 
and deaths is an irksome duty which the village headman 
often neglects. In the case of births he is very likely to wait 
and see whether the child would remain alive to save himself 
the worry of making a second report of its death if death soon 
occurs. He hesitates to report deaths to avoid the unwelcome 
visits of unduly suspicious police officers. He generally holds 
up the reports of births and deaths for a weekly or fortnightly 
visit to the Tehsil or taluk headquarters. The records in towns 
are said to be more imperfect than those in villages. A clearer 
appreciation of the population problem of India would have 
been possible, if only vital statistics were accurate and com- 
plete. The census reports of 1941 might be able to throw 
some light on the net-reproduction rate in India. The system 
of registration of births and deaths should be brought up to 
the level reached in other countries. It may be suggested that 
the list of persons to whom such daily occurrences may be 
reported should be widened, and an organisation should be 
established to deal with these day to day instances. 

EXERCISES 

(1) Explain with examples, the important sources of errors 
in the census returns. How can these errors be avoided ? 

(B. Com., Alld., \9SH)i 

(2) In what respects are the statistical data, available in 
India, deficient from an economic point of view? How can this 
deficiency be removed? 

(3) In the Census Report for 1931, the Census Commissioner 
for India observes: 

‘The error in the numerical count has been put at a maxi- 
mum of one per mille and is probably less.’ 

Comment upon this statement. 

(B. Com., Alld., 1935). 

(4) Explain fully the method that should be employed in 
making an economic survey of any large town in India. 

(B. Com., Alld., 1936). 

(5) Explain the main defects of the statistics of prices and 
wages in India. How can these defects be removed? 

(B. Com., Alld., 1938). 

(6) ‘The statistical publications relating to the decennial 
censuses of population in India leave little to be desired.’ 

(Indian Economic Enquiry Committee Report). 


Comment. 

(7) Discuss the method recommended by the Bowley- 
Robertson Committee for the measurement of the National Income 
of India. 

(B. Com., Alld., 1940). 

(8) What methods are usually adopted for estimating the 
national dividend of a country? To what extent, in your opinion, 
are recent estimates of the ‘national dividend’ of India reliable? 

(M.A., Alld., 1936). 

(9) Why is an economic survey of a country considered 
essential before adopting a programme of economic development? 
How would you conduct an economic survey of India? 

(M.A., Alld., 1936). 

(10) Describe briefly the nature and sources of the data used 
in the Review of the Trade of India. Are there any gaps and 
defects in this account? 

(M. Com., Luck., 1942). 

(11) Explain the important sources of biassed errors in the 
collection of data regarding wages, prices and yield of crops in 
India. How can these errors be avoided? 

(H. Com., Luck., 1938). 

(12) How would you organize a Central Statistical Bureau 
for India? Explain clearly its main functions. 

(B. Com., Luck., 1938). 

(13) Give the Blank form of a census schedule used in India. 
What improvements would you suggest to the schedule? 

(B. Com., Luck., 1939). 

(14) Discuss briefly the methods of calculating National 
Income. How far are these methods available for calculating the 
national income of this country? 

(B. Com., Bombay, 1936). 

(15) What statistical information is available in India with 
regard to (a) Imports and Exports, (b) Prices, and (c) Agricul- 
tural statistics? Examine their sufficiency. 

(B. Com., Alld., 1943). 

(16) Discuss the possible value of Census Reports to pro- 
ducers, manufacturers and businessmen. How can the Indian 
Census Reports be made more useful to these people? 

(M. Com., Alld., lS48). 

(17) What are the principal sources of statistical data for 
British India? Examine, suggesting improvements, the materials 
available under the following heads: (a) Agriculture, (b) Wages 
and (c) Prices. 

(M. Com., Alld., 1948). 

(18) How will you estimate the wealth of a country? Dis- 
cuss the problem of organizing a census of economic production 
in India. 

(B. Com., Agra, 1939). 

(19) Write a note on the inadequacy of statistical data in 
India for sociological and economic inquiries, suggesting methods 
for removing the inadequacy. 

(20) State what you can about Indian Vital Statistics. Do 
they throw any light on the causes of India’s poverty? 

or, Discuss the available sources of information in respect of 
India’s trade, both foreign and inland. 

(B. Com., Agra, 1940). 

(21) What are the present methods of collecting agricultural 
statistics of acreage and yield per acre? Discuss the accuracy 
of the methods followed. Can you suggest improvements special- 
ly for permanently settled areas? 

(M.A., Cal., 1936). 

(22) Describe the present method of occupational classifica- 
tion followed in Indian censuses. Do you consider it satisfactory? 

(M.A., Cal., 1936). 

(23) “The question whether a given currency is over-valued 
or under-valued at the current rate of exchange bristles 
with difficulties.” (H. R. Committee report). 

State the statistical difficulties in the case of India. 

(24) If you are asked to compare the economic effect of the 
present war on India with that in Great Britain, what statistics 
would you use? 

(M. Com., Lucknow, 1942). 

(25) Discuss the utility of the data regarding occupations 
collected at the time of the last census. How can these data be 
utilised in estimating the National Income of India? 

(M. Com., Lucknow, 1942). 

(26) (a) Briefly enumerate the causes of high infantile 
mortality in India and suggest the steps to be taken to bring down 
the rate. 

(b) Examine the reliability and sufficiency of statistics 
relating to such infantile mortality in British India. 

(B. Com., Alld., 1912). 

(27) What do you understand by the net reproduction rate? 
How may this be utilized for estimating the future population of 
a country? Are there any special difficulties in the case of India? 

(M.A., Alld., 1942). 

(28) What kind of information on social and economic sub- 
jects is available in 

(a) The Monthly Survey of Business Conditions in India, 
(b) Statistical Abstract of British India, (c) Review 
of Trade of India, (d) The Bombay Labour Gazette, 
(e) The Indian Census Reports and (f) the “Capital.” 

(29) Give the general method of preparing crop forecasts 
issued by the Department of Commercial Intelligence and Statis- 
tics in India. Suggest measures for improving their accuracy and 
usefulness. 

(30) What improvements, in your opinion, should be made 
in the Statistical Abstract of British India to increase its general 
usefulness? 

(31) “The statistics even of crop production leave much to 
be desired, while statistical information about other important 
parts of agricultural income, such as the output of animal hus- 
bandry, are almost completely lacking, and statistics of industrial 
production are patchy in the extreme.” (B. R. Committee Report). 

Prove the correctness of the above statement by taking ex- 
amples from Indian statistics and suggest measures for removing 
the defects. 

(32) ‘Indian age-returns in the census are admittedly defec- 
tive.’ In support of this statement give the reasons for biassed 
and unbiassed errors in age-returns and state the steps that were 
taken in the census of 1941 to avoid deliberate over-estimation 
and under-estimation of ages. 



CHAPTER VIII. 

CLASSIFICATION AND TABULATION OF DATA 


CLASSIFICATION 

Statistical data, collected in the course of an enquiry, 
concern a number of units of one kind or another which 
together form the group relating to the inquiry. A statistical 
group consists of a large number of things or individuals, 
having something in common, but differing from one another 
in respect of some measurable characteristics. For example, 
students belonging to the same college may differ from one 
another in regard to their age, civil condition or height. But, 
together they constitute a group. A statistical group is very 
large so that no one can appreciate at a glance, or even after a 
careful study, the information relating to many units. A 
reading of a thousand or more schedules returned by the 
students of a college respecting their age, weight, height etc. 
cannot enable the reader to get a proper idea of the details 
mentioned. Some process of condensation must be devised 
for the purpose. This process yields statistical tables. But 
before tables can be prepared the different units must be 
grouped together into classes so that the like will go with the 
like and the unlike with the unlike. Details would necessarily 
be lost, since the individual units would be merged in a class. 
For instance, all those students who return themselves as 
‘married’ shall be placed in one class, the ‘unmarried’ in 
another and the ‘widowed’ in the third. From the table 
that shall then be prepared none shall be able to identify him- 
self, since he or she shall be merely one in a class composed 
of those similar to him in respect of civil condition. This 
process is called Classification. 

“Classification is the process of arranging things (either 
actually or notionally) in groups or classes according to their 
resemblances and affinities, and gives expression to the unity 
of attributes that may subsist amongst a diversity of 
individuals.”* 

The objects of classification are many. It clearly shows 
points of similarity and dissimilarity. It helps one to form 
a mental picture of the objects he can see or conceive. By 
condensing the details it saves one from mental strain. It 
affords an appreciation of the information that would other- 
wise have been left out as perplexing or unimportant. It 
prepares the ground for enabling comparisons and inferences. 
It institutes a logical and orderly arrangement of things. 

Importance of classification in Statistics cannot be over- 
emphasized, and yet it is something for which no very precise 
rules can be laid down. Skill and patience are, no doubt, 
indispensable; but as in collection of data so in its classifica- 
tion, experience alone will convince one of the requisite care 
if blunders are to be avoided and time saved. It may be noted 
that an ideal classification should possess the merits of being 
unambiguous, stable and flexible. It should not leave room 
for doubt; it should be stable enough to render comparisons 
easy, and it should be so flexible as to incorporate new ideas 
as they materialize in future. 

Classification is determined by the characteristics possessed 
by the individual units of a group. These characteristics are 
of two kinds: descriptive and numerical. Descriptive charac- 
teristics comprise attributes or qualities, possessed by 
objects or individuals, such qualities not being quantitatively 
measurable. Characteristics like sex, civil condition, caste, 
religion and infirmity are descriptive. Numerical characteris- 
tics are so called because they are susceptible of quantitative 
measurement. Age, height, income, weight are numerical 
characteristics. Classification of given data by descriptive 
characteristics is generally called classification according to 
attributes, while that by numerical characteristics is commonly 
known as classification according to class-interval. 

*L. R. Connor, Statistics in Theory and Practice, 1938 ed., p. 18. 

Classification according to Attributes. 

Descriptive characteristics can be classified by means of 
some natural or physical lines of demarcation. Natural or 
physical differences determine the classes into which units 
should be placed. It is easy in those cases to separate the 
similar from the dissimilar characteristics. For instance, 
population of India may be classed into male and female, 
literate and illiterate, blind and not blind. Thus, when one 
attribute is noticed, two distinct classes are formed. These 
two classes are exclusive of each other. If the members of one 
class possess the common quality of being males those of the 
other are devoid of it. A classification of this type, where 
each class is divided into two sub-classes only, is called 
Simple Classification. 

Where more than one attribute is studied several classes 
may result. For instance, the population of India may not 
only be classed into males and females, but males and females 
may be further sub-divided into literate and illiterate such as 
male literate and female literate, male illiterate and female 
illiterate. Classification may be carried on still further, for 
instance, according to occupation. A male literate may be 
a teacher, a female literate a stenographer ; a male illiterate 
may be a peon, a female illiterate may be a maid servant. 
Further classification according to religion or caste is yet 
possible. Numerous classes may thus be formed. A classi- 
fication of this type, where each class is divided into more 
than two sub-classes, is called Manifold Classification. 
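
The distinction between classifying by one attribute and by several may be illustrated by a short counting sketch in Python. The six returns below are hypothetical, and the variable names are chosen only for this illustration; the sketch merely counts units falling into each class.

```python
from collections import Counter

# Hypothetical returns: (sex, literacy, occupation) for six persons.
returns = [
    ("male", "literate", "teacher"),
    ("female", "literate", "stenographer"),
    ("male", "illiterate", "peon"),
    ("female", "illiterate", "maid servant"),
    ("male", "literate", "teacher"),
    ("female", "illiterate", "maid servant"),
]

# One attribute noticed: two mutually exclusive classes result.
by_sex = Counter(sex for sex, _, _ in returns)

# Further attributes noticed: each class is sub-divided again.
by_sex_and_literacy = Counter((sex, lit) for sex, lit, _ in returns)
by_all_three = Counter(returns)

print(by_sex)
print(by_sex_and_literacy[("male", "literate")])
```

Every unit falls into exactly one class at each stage, so the class frequencies always add up to the number of returns.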

The following brief classification of languages of India 
affords a good example of manifold classification : 

A. Languages of India And Burma, 

(i) Austric (ii) Tibeto-Chinese (iii) Dravidian (iv) 
Indo-European (v) Unclassed. 

B. Languages of other Asiatic Countries and Africa, 

(i) Indo-European (ii) Tibeto-Chinese (iii) Semitic 
(iv) Hamitic (v) others. 

C. Languages of Europe. 

(i) Indo-European (ii) others. 

At the risk of repetition, it is necessary to state that in 
classification according to attributes the boundary line between 
different classes, though artificially set, is definitely made 
before the work of classification begins. For instance, the 
decision as to who would be recorded as literate and who 
illiterate is made before actual classification. 

Classification according to Class-Intervals. 

Numerical characteristics can also be classified by 
assigning arbitrary limits. The ages of persons, for instance, 
are of indefinite variety, so also are heights, or weights. But, 
the entire range of ages, heights or weights, from the lowest 
to the highest, can be broken up by drawing arbitrary boundary 
lines, and those units which are nearly alike in respect of a 
particular character are put together in one class. Thus if the 
ages of a given group of people vary from 25 to 44 years and it 
is desired to divide the group into four classes, the boundary 
lines would preferably be fixed on numbers 25, 30, 35, 40 and 45 
years. These boundary lines are known as the class-limits, the 
group constituted by two limits as the class-interval, the dis- 
tance between two limits of a class-interval as its magnitude, and 
the number of observations falling within a particular class- 
interval as its frequency. In our example, we group together 
those whose ages are 25 years and more but less than 30 years 

and place their number in the class-interval 25-30 years, group 
those who are 30 years and more but less than 35 years and 
place their number in the class-interval 30-35 years, and so on. 
Evidently, the magnitude of our class-interval is 5 years. 
Unit is the most common magnitude. So far as possible, the 
magnitude of all class intervals should be uniform, so that the 
labour of calculating different statistical constants may be 
minimised. Even if a particular class-interval contains no 
frequency, the class-interval must be entered in its proper 
place, otherwise errors might be made in plotting the results. 
The limits of the class-intervals should preferably be so fixed 
that the mid-point of each falls on an even unit and not on a 
fraction. One might ask: ‘How many groups should there 
be?’ In answer it may be said that the number of class- 
intervals is dependent on the nature of the inquiry. In general 
a number of groups in the neighbourhood of 20 is the most 
satisfactory, provided the number of observations is reasonably 
large. Thus classification according to class-intervals is 
obtained when a numerical characteristic is considered and 
each group is subdivided into a number of classes or groups, 
rather arbitrarily. Table 1 stands as an illustration. 

The above class-intervals, viz. 25-30 years, 30-35 years etc., 
are expressed according to the exclusive method, that is, the 
upper limit of the one class-interval is the lower limit of the 
succeeding class-interval. An item 29.99 years would fall in 
the class-interval 25-30 years, while an item exactly 30 years 
would be taken to the second class-interval (30-35 years). This 
difficulty can be won over by classifying the class-interval as 
‘25 and under 30 years’, ‘30 and under 35 years’, and so on. 
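
The exclusive method can be sketched in a few lines of Python. The class-limits are those of the example in the text; the list of ages and the function name are hypothetical, introduced only to show how an item on a limit is assigned to the upper class.

```python
from bisect import bisect_right

limits = [25, 30, 35, 40, 45]  # class-limits from the example in the text
ages = [25, 29.99, 30, 33, 38, 41, 44, 27, 30.5, 36]  # hypothetical ages

def exclusive_frequencies(values, limits):
    """Count values in classes 'lower and under upper' (upper limit excluded)."""
    freq = [0] * (len(limits) - 1)
    for v in values:
        if not limits[0] <= v < limits[-1]:
            raise ValueError(f"{v} lies outside the range of the classes")
        # bisect_right sends a value equal to a limit into the higher class.
        freq[bisect_right(limits, v) - 1] += 1
    return freq

freq = exclusive_frequencies(ages, limits)
for lower, upper, f in zip(limits, limits[1:], freq):
    midpoint = (lower + upper) / 2
    print(f"{lower} and under {upper}: frequency {f}, mid-point {midpoint}")
```

An item of 29.99 years is counted in the first class and an item of exactly 30 years in the second, as the text requires.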

Class-intervals are also expressed by the inclusive method. 
The above class-intervals arranged according to the exclusive 
method would be expressed as 25-29 years, 30-34 years etc. 
according to the inclusive method. In this case the upper 
limit of the one class-interval is also included in the class- 
interval itself. The first class would include all items between 
24.5 and 29.5 years. To be still more unambiguous, the class- 
intervals may be expressed as 25-29.9 years, 30-34.9 years. 
But the inclusive method is not in general use since the idea 
of continuity in the limits of class-intervals is lost. 
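
The relation between the stated limits of the inclusive method (25-29, 30-34, ...) and the continuous range each class really covers (24.5 to 29.5, ...) may be sketched as follows. The function name is hypothetical, and the sketch assumes that the gap between successive classes is the same throughout.

```python
def class_boundaries(inclusive_classes):
    """Turn stated limits such as (25, 29) into boundaries such as (24.5, 29.5)."""
    # Half the gap between one class's upper limit and the next class's lower limit.
    gap = inclusive_classes[1][0] - inclusive_classes[0][1]
    half = gap / 2
    return [(lower - half, upper + half) for lower, upper in inclusive_classes]

inclusive = [(25, 29), (30, 34), (35, 39), (40, 44)]
boundaries = class_boundaries(inclusive)
print(boundaries[0])  # boundaries of the first class
```

The boundaries so obtained meet one another without a break, which restores the continuity that the stated inclusive limits appear to lose.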

Statistical Series. 

If the quantities or values of some aggregate are measured, 
counted or weighed, or numbers in some group or class are 
counted, and they are placed one after another, the result is 
a statistical series. Briefly a statistical series may be defined 
as things or their attributes arranged according to some logical 
and systematic order. 

Time, Spatial and Condition Series. 

Statistical series may be distinguished, according to three 
bases of classification of data, as (1) historical or time (2) spatial 
and (3) condition. In the first, facts are arranged with respect 
to time, for instance, index numbers of wholesale prices of 
wheat in India over a period of time detailed in chronological 
order. In the second, the controlling factor in presentation is 
place: variations are noted geographically. Production of 
wheat in India for a given date arranged according to different 
provinces would constitute a spatial series. In the third, 
variations in size and amount of things or their attributes are 
shown. The different measurements of natural phenomena are 
usually distributed about a norm. If heights, weights, or ages 
of students, or lengths of a number of leaves chosen at random 
are measured, the different measurements, when arranged in 
logical order, shall constitute a condition series, and it will be 
noticed that though the different measurements would vary, a 
most common, predominant weight, height, age or length of 
leaves would be found. We shall see later (in Chapter X) that 
this ‘most common’ length is called the ‘mode’ of the series. 
Condition series take the form of what are called ‘frequency 
tables’, e.g. Table 1. 

Continuous and Discrete Series. 

Series may be continuous or discrete. When the items of 
a series are not capable of being determined with mathemati- 
cal accuracy, but are always measured by approximation and 
can only be placed within certain limits, the resultant record 
is a continuous series. Table 1 serves as an example. On the 
contrary, where the items are exactly measurable and their 
record shows definite breaks between one value and the other 
succeeding it, the resultant record is a discrete or broken or 
discontinuous series. For illustration see Table 5. 

Measurements of weights, magnitude and volume consti- 
tute continuous series for apparently there is no limit to the 
sub-division of maunds, miles and gallons. The number of 
labourers in factories, of spots in dice-throw or of pages in a 
book form discrete series, since they must give integral 
numbers and are incapable of sub-division. 

TABULATION 

After the data have been classified they may be tabulated, 
that is, put into tabular form. Tabulation stands for the 
systematic and scientific presentation of quantitative data in 
such a form as to elucidate the problem under consideration. 
Its function is to arrange in an orderly manner the answers to 
those questions with which the inquiry is concerned. Tables 
are intended to summarize the information obtained in course 
of an investigation. 

Rules and Precautions for Tabulation. 

Some precautions in drawing up tables are necessary. The 
given data may be grouped in one table or several tables. A 
single table shall, no doubt, bring the entire data into proxi- 
mity; but if it is too large, it shall confuse the eye and lead 
to great difficulty in following the columns and rows at a 
glance. To do away with such inconvenience it may be 
broken up into several separate tables. Further, several com- 

parisons of different nature should not be jumbled up in one 
table. Each table should be a unit. Usually there should be 
separate tables for different distinct purposes. Again, there 
should be few main divisions with several sub-headings under 
each. If the number of headings is very large, the main facts to 
be compared may not be adequately emphasized. Of course, the 
exact number of divisions and sub-headings shall be deter- 
mined by the data in hand. Each table should be so complete 
in itself that it may not be made more intelligible by re-draft- 
ing. The table should suit the size of the paper on which it 
is drawn. So, the width of each column and row should be 
properly calculated and headings correctly arranged before 
the permanent table is ruled or figures are entered in it. 
Totals, averages, percentages and the numbers that are to be 
compared should be placed close together, and, if possible, they 
should be placed in the same vertical column rather than the 
same horizontal row. Columns that are to be compared 
should be placed adjacent to one another. The rulings in 
tables should be such that principal groups are separated by 
thick or multiple-ruled lines. Unimportant data may be 
grouped together and placed in a ‘miscellaneous’ group. Items 
which are in any way different from the rest of the items, 
e.g., estimated figures, revised figures, should be marked with 
an asterisk or number, and an explanatory note given beneath 
the table. The table should be given a suitable title. This 
title and the title of sub-headings should be self-explanatory, so 
that no reference in the text or footnotes for the purpose may 
have to be looked for. The title should neither be too small 
nor ambiguous. The column heading should indicate the unit 
used, as ‘height in inches’, ‘price in rupees’, ‘weight in tons’. 
Large digits may be approximated and mentioned in thousands, 
lakhs or millions. A table may show absolute figures, increases 
or decreases from past years' figures, percentages etc. according 
to the nature of comparison to be made. All items should be 
carefully checked before entering in the table or totalling 
them up. Arrangement should also be made in the table for 
testing and cross-checking. There should be no over-writing, 
otherwise the neatness of the table would be lost. A written 
analysis pointing out principal conclusions and possible errors, 
with probable reasons for them, should accompany the table. 

Different types of Tabulation. 

In very general terms tabulation may be distinguished as 
simple and complex. A simple table contains data respecting 
one characteristic only, information relating to other character- 
istics being left out. Table 1 is a case of simple tabulation. A 
more complex table may contain figures relating to several 
characteristics. Tables 2, 3 and 4, represent this type. 

Tabulation is also classified as Single, Double, Treble and 
Manifold. A single tabulation is one that answers one or 
more groups of independent questions. The following table 
gives the frequency distribution of marks obtained out of a 
maximum of 50 by the students of a class in their test in 
Economics. 


Table 1. Frequency Distribution of marks in Economics. 

Marks-group      Number of students 
                 (frequency) 

0-5                     4 
5-10                    6 
10-15                  10 
15-20                  16 
20-25                  12 
25-30                   8 
30-35                   4 

The table is capable of furnishing a first approximation to 
the answer to an inquiry into the ' ordinary marks ' obtained 
by the students in the test. It tells, for instance, that the 

number of students getting marks between 15 and 20 is the 
highest as compared with the number in any other group. A 
still simpler table will be one showing yearly variation of 
something, say, progress of cotton mill industry in India or of 
trading profits of a company. 
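
The reading of Table 1 given above, namely that the group 15-20 has the highest frequency, can be checked mechanically. The following Python sketch uses the frequencies of Table 1; the variable names are merely illustrative.

```python
# Frequencies of Table 1: marks-group -> number of students.
table_1 = {
    "0-5": 4, "5-10": 6, "10-15": 10, "15-20": 16,
    "20-25": 12, "25-30": 8, "30-35": 4,
}

total_students = sum(table_1.values())
# The marks-group with the largest frequency.
modal_group = max(table_1, key=table_1.get)

print(total_students, modal_group, table_1[modal_group])
```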

Double tabulation shows the sub-division of a total accord- 
ing to two categories, and is capable of answering two mutually 
dependent questions. Table 2 is an illustration of this type 
of tabulation. It shows the distribution of 376 industrial 
disputes of the year 1941 into (i) different kinds of mills and 
(ii) different quarters of the year. 

Table 2. Industrial Disputes in India in 1941. 

                                Industrial Disputes 
For quarter         Cotton    Woolen    Jute     Others    Total 
ending                        mills     Mills 

31st March            30         1                 40        71 
30th June             52         3                 66       121 
30th September        34         3                 41        78 
31st December         33         1       10        63       106 

Total                149                          210       376 
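
A double tabulation of this kind can be built up from individual records, with the marginal totals serving as the cross-check recommended among the rules for tabulation. The sketch below is in Python; the eight records are hypothetical and are not the figures of Table 2.

```python
from collections import Counter

# Hypothetical dispute records: (quarter ending, kind of mill).
records = [
    ("31st March", "Cotton"), ("31st March", "Others"),
    ("30th June", "Cotton"), ("30th June", "Cotton"),
    ("30th June", "Jute"), ("30th September", "Others"),
    ("31st December", "Woolen"), ("31st December", "Cotton"),
]

cells = Counter(records)                      # body of the table
row_totals = Counter(q for q, _ in records)   # totals by quarter
col_totals = Counter(m for _, m in records)   # totals by kind of mill
grand_total = len(records)

# Cross-check: both sets of marginal totals must add up to the grand total.
assert sum(row_totals.values()) == grand_total == sum(col_totals.values())

print(cells[("30th June", "Cotton")], row_totals["30th June"], col_totals["Cotton"])
```

If the row totals and the column totals do not lead to the same grand total, some entry in the body of the table has been wrongly copied or added.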


Treble Tabulation sub-divides a total into three distinct 
categories and answers three mutually dependent questions. 
Table 3 is a blank table to illustrate treble tabulation. It 
shows the distribution of India's population into urban and 
rural, for main religions, in British Provinces, and States and 
Agencies. 

Table 3. Distribution of India's population into urban and 
rural according to main religions in Provinces, 
and States and Agencies. 

              |      Provinces      |  States & Agencies  |        Total 
Religion      | Urban  Rural  Total | Urban  Rural  Total | Urban  Rural  Total 
--------------+---------------------+---------------------+--------------------- 
Hindu         |                     |                     | 
Sikh          |                     |                     | 
Jain          |                     |                     | 
Buddhist      |                     |                     | 
Zoroastrian   |                     |                     | 
Muslim        |                     |                     | 
Christian     |                     |                     | 
Jew           |                     |                     | 
Tribal        |                     |                     | 
Others        |                     |                     | 
--------------+---------------------+---------------------+--------------------- 
TOTAL         |                     |                     | 


Manifold Tabulation is one that divides a total into 
several categories, generally more than three. The following 
blank table drawn to show the distribution of population in 
British Indian Provinces according to age, sex, literacy and 
caste illustrates manifold tabulation. 

Table 4. Distribution by sex, literacy and caste in 
provinces of British India. 

Province    Caste         Age-Group    Male    Female    Total 

Bengal      Kayasth       0-25 
                          25-50 
                          50-75 
                          Over 75 
                          Total 

            Brahmin       0-25 
                          25-50 
                          50-75 
                          Over 75 
                          Total 

            All Castes    0-25 
                          25-50 
                          50-75 
                          Over 75 
                          Total 

Bihar       Kayasth       0-25 
                          25-50 
                          50-75 
                          Over 75 
                          Total 

EXERCISES 

(1) Define Classification and Tabulation and show their 
importance in statistical studies. 

(2) “In collection and tabulation commonsense is the chief 
requisite and experience the chief teacher.” (Bowley). 

Comment upon the above statement. 

(3) What different types of tabulation do you know? Indi- 
cate their characteristics. 

Draw up blank tables to illustrate your answer. 

(4) Explain the Temporal, Spatial, Qualitative and Quanti- 
tative bases of classifications. 

(5) Point out the mistakes made in the following blank table 
drawn to show the distribution of population according to sex 
and literacy in five towns in the U.P.: — 


Number of 
Literates 


Number of 
Illiterates 


(6) Re-arrange the following blank table with a view to 
make it more intelligible. 


[Blank tables for exercises (5) and (6). Only the following
headings are legible: Males, Females; Brahmin, Rajput, Kayastha,
Harijan; Sex: Male, Female.]


(B. Com., Alld., 1940). 








(7) Draw up in detail, with proper attention to spacing, 
double lines, etc. and showing all sub-totals, a blank table in which 
could be entered the numbers occupied in six industries at two 
dates distinguishing males from females, and among the latter 
single, married and widowed. 

(M.A., Alld., 1940).

(8) Prepare a specimen form in blank, with suitable heading
and spacing, for use in collection of data on one of the following:

(a) Survey of trades in your districts.

(b) Standard of living of middle class families in a small
town.

(c) Expenses of students in a University.

(Dip. in Econ., Madras, 1931).

(9) Explain how you would tabulate statistics of deaths
from principal diseases by sexes in different provinces of India
for a period of five years.

(B. Com., Cal., 1937). 

(10) Prepare a table with a proper title, divisions and sub- 
divisions to represent the following heads of information: — 

(a) Imports of cotton piece goods in India.

(b) From U. K., Netherlands, Belgium, Switzerland, Italy,
Straits Settlement, Japan.

(c) Amount of piece goods from each country.

(d) The value of goods from each country.

(e) Pre-war average, war average, post-war average,
1924-25, 1925-26, 1926-27, 1927-28.

(f) Total amount imported during each period.

(g) Total value of imports during each period.

(B. Com., Luck., 1930). 

(11) Discuss the function and importance of tabulation in a 
scheme of investigation. 

Prepare blank tables, showing the distribution of the students 
of a University according to age, class, and residence, for arrang- 
ing (a) physical training, and (b) seminar classes. 

(12) Prepare blank table to show the distribution of popula- 
tion according to sex and four religions in five age-groups, in 
seven important cities of U. P. 


(B. Com., Agra, 1937). 





(13) Draw up two independent blank tables, giving rows, 
columns, and totals in each case, summarizing the details about 
the members of a number of families, distinguishing males from 
females, earners from dependents, and adults from children. 

(M.A., Cal., 1935). 

(14) What is a statistical series?

Differentiate between continuous and discrete series. Give
illustrations.

Also distinguish between ordinary and cumulative frequencies.

(15) Write short notes on:

Frequency table, frequency, class-limits, class-interval,
magnitude of class-interval, inclusive and exclusive methods of
classification, treble tabulation, manifold classification.

(16) What are the essentials of a good statistical table?
What rules and precautions should be observed in drawing up a
table?

(17) Following are the heights in inches of 53 students of 
a class. Tabulate them by grouping them in class-intervals of 
five inches: — 

58, 56, 57, 57, 52, 53, 56, 51, 49, 48, 47, 48, 49, 60, 62,
46, 51, 60, 50, 53, 54, 55, 56, 67, 56, 54, 59, 60, 47, 48,
61, 65, 63, 46, 62, 61, 52, 52, 53, 52, 56, 54, 52, 52, 63,
55, 48, 50, 61, 62, 66, 61, 52.

(18) Prepare a blank table to show the value and quantity
of different kinds of cotton goods imported into India from
different countries of the world during the past six years.

(19) Classify the following according to attributes:

(a) occupations in India, (b) exports and imports of India,
(c) wants of university students, (d) books of a college library,
(e) religions of India.


(20) Following are the weekly earnings of labourers in a
factory:

   Earnings            No. of labourers
Rs.   A.   P.
 4    6    0     . .   25
10   10    6     . .   6
 6    9    3     . .   35
 6   10          . .   42
 6   11    9     . .   [illegible]
 7   12    3     . .   21
 9    9    0     . .   17
 8    2    0     . .   15
10    0    0     . .   [illegible]
 5    0    0     . .   52

Tabulate the above data in classes of Rs. 4-5, Rs. 5-6 etc., and
give a suitable heading to the table.


(21) Draft tables to show the distribution of population by
(i) age, sex and civil condition,

(ii) sex and infirmities,

(iii) sex and occupations,

(iv) age, sex and literacy.



CHAPTER IX 


SIMPLE DERIVATIVES 

After the collected data have been edited, classified and
tabulated, the resulting table, though compressing unwieldy
data to a large degree, cannot be easily grasped or compared
with other tables, merely because a table contains a number of
entries. Some method of concisely describing the data has,
therefore, to be devised. The simplest method is that of
computing certain derivatives from the data. A statistical
derivative is a quantity resulting from a combination of two
or more original figures. Therefore, it should be remembered
that statistical derivatives do not arise from simple
measurement or counting, but always from computation.

Derivatives are both simple and complex. We shall later
see how statistical averages of the first order¹ are derived from
the original data. These averages are complex derivatives,
and so are those of the second order, the measures of
dispersion.² Simple derivatives consist of relative numbers.
Two relationships are distinctly marked: Subordinate and
Co-ordinate. Accordingly, there are subordinate and
co-ordinate derivatives.


Subordinate Derivatives.

Subordinate derivatives are those which show the relative
size of parts to a whole. They are generally expressed as
proportions or percentages, e.g. the proportion or percentage
of area under food crops and under non-food crops to total
area cultivated. Here the total is divided into two categories.

¹ See Chapter X, on Statistical Averages.

² See Chapter XI, on Dispersion and Skewness.



Co-ordinate Derivatives. 

Co-ordinate derivatives are those which show the relative
size of pairs of inter-related co-ordinate masses. These include
several varieties:

1. The Simple Difference between two quantities of like
kind, e.g. comparing this year's production of sugar in India
with the past year's.

2. The Percentage Difference, the difference being
expressed as a percentage upon some quantity taken as
standard, e.g. this year's production is so much per cent. higher
or lower than that of the past year.

3. The Ratio. It is another way of expressing the
percentage difference. Instead of saying what we did in the case
of percentage difference, we may say that production of sugar
this year has increased from the past in the ratio of 100:120,
or 50:60, or 5:6, all ratios being identical ways of expression.

4. The Rate. Generally speaking, when the two
quantities to be compared are of the same kind we use the term
ratio, e.g. ratio of boys to girls in a university. Here both the
quantities are the same, viz. students of the same institution.
But when the numerator and denominator are of different kinds
we speak of rates, e.g., sickness rate, marriage rate, mortality
rate. Here the comparison is between quantities of different
kinds.

Further, a rate is usually standardized in regard to the
denominator. One mass is divided by the other related mass
and the quotient is multiplied by 100 or 1,000; and we speak
of rate per cent., rate per mille.

But the distinction between rates and ratios is not rigid.
We speak of birth rate, i.e. number of births per 1,000
population. We may equally correctly speak of the birth ratio, i.e.
ratio of the number of births to the number living.

Rate per unit is called a statistical co-efficient. If the birth
rate is 30 per 1,000, the co-efficient is .03. The characteristic
of this co-efficient is that if it is used to multiply a total (e.g.
population) an allied number (e.g. number of births) would
be obtained.
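
The varieties of co-ordinate derivative listed above can be sketched in a few lines of Python. This is only an illustration: the function names are our own, and the sugar figures of 100 and 120 are taken from the ratio 100:120 used in the text, not from any actual production return.

```python
from fractions import Fraction


def simple_difference(current, previous):
    """Simple difference between two quantities of like kind."""
    return current - previous


def percentage_difference(current, previous):
    """Difference expressed as a percentage of the standard (previous) figure."""
    return 100 * (current - previous) / previous


def ratio(previous, current):
    """Ratio previous:current reduced to its lowest terms."""
    f = Fraction(previous, current)
    return f.numerator, f.denominator


def rate(events, population, per=1000):
    """Rate standardized to a round denominator (per cent, per mille, ...)."""
    return events * per / population


def coefficient(events, population):
    """Rate per unit, i.e. the statistical co-efficient."""
    return events / population


# Illustrative figures only: production of 100 units last year, 120 this year.
last_year, this_year = 100, 120
difference = simple_difference(this_year, last_year)       # 20 units
per_cent = percentage_difference(this_year, last_year)     # 20 per cent higher
lowest_terms = ratio(last_year, this_year)                 # 5:6

# Birth rate of 30 per 1,000 and the corresponding co-efficient .03.
birth_rate = rate(30, 1000)
birth_coefficient = coefficient(30, 1000)
```

Multiplying a population by `birth_coefficient` gives the allied number of births, which is the property of the co-efficient noted above.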

Purpose of Computing Statistical Derivatives. 

Simple derivatives are computed to compare statistical
groups. In the computation of rate per cent. or rate per mille
observations are reduced to a common denominator and
comparison is thereby facilitated. If in University A 900
candidates were successful out of 1200 who appeared, and in
University B 980 passed out of 1400, a comparison of the
absolute number of successful candidates, 900 and 980,
without considering the number of those who appeared at the
examination, would lead one to declare the result of the B
University as better than that of the A. But, when the two
results are reduced to a common denominator, say, expressed
in percentages, this impression will be reversed. The
percentage of successful candidates in A is,

\[ \frac{100 \times \text{Number of successful candidates}}{\text{Number appeared}} = \frac{100 \times 900}{1200} = 75. \]

Similarly, the percentage in B is,

\[ \frac{100 \times \text{Number of successful candidates}}{\text{Number appeared}} = \frac{100 \times 980}{1400} = 70. \]

Both the results have now been reduced to a common
denominator, 100. It is evident that the percentage of success in A
University is higher than that in B University. Therefore,
A's result is better than B's. The usefulness of relative
numbers for purposes of comparison is thus clear.
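
The comparison of the two universities can be checked with a short Python sketch. The function name `pass_percentage` is our own; the figures are those of the example above.

```python
def pass_percentage(passed, appeared):
    """Percentage of successful candidates among those who appeared."""
    return 100 * passed / appeared


# (successful, appeared) for the two universities of the example.
results = {"A": (900, 1200), "B": (980, 1400)}

percentages = {
    name: pass_percentage(passed, appeared)
    for name, (passed, appeared) in results.items()
}

# Ranking by absolute numbers and ranking by percentages disagree.
best_by_count = max(results, key=lambda name: results[name][0])
best_by_percentage = max(percentages, key=percentages.get)
```

Here `best_by_count` is B while `best_by_percentage` is A, which is the reversal of impression described in the text.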

But relative numbers can also be used for another purpose,
viz., computing the size of an unknown mass from a known
one. The known mass, i.e. the relative number, may be an
actual figure or an estimated one. Relative numbers are often
estimated when there are no sufficient data for their computation.
Such estimates can be employed to know the size of the mass
to which they relate. Statisticians have often used them to
obtain the population figures of past times. If we know,
historically, the number of artisans or beggars of a city or
country, we may make an estimate of the percentage of the
entire population that the artisans or beggars probably
formed, on an average, at the time under consideration, and
hence compute the total population for that time. The
estimation of the artisans or the beggars is made possible by the
law of statistical regularity: the ratio between definite
statistical masses is often fairly constant. This ratio can be
easily estimated to be within certain limits. This holds good,
for example, for the relationship between population and
births and deaths. Every estimate, however, must be regarded
as simply an approximate value. It may or may not be
accurate. Therefore, the size of an unknown mass computed
from an estimated relative number must also be regarded as
merely approximate.

But, where actual percentages are known the mass or
population to which they relate can be ascertained to a
considerable degree of precision. For, given that the number of
successful candidates at a certain examination was 900, and
this constituted 75% of the total, it is easy to see that the
total number of candidates who appeared at the examination
was 1200.
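
A minimal Python sketch of this second use of relative numbers follows. The first call uses the examination figures given above; the artisan figures (5,000 artisans, believed to form between 4 and 5 per cent. of the population) are purely hypothetical and serve only to show how an estimated percentage yields limits rather than a single value.

```python
def total_from_percentage(part, percent):
    """Size of the whole mass when a known part forms `percent` per cent of it."""
    return part * 100 / percent


# Actual percentage known: 900 successful candidates were 75% of the total.
candidates = total_from_percentage(900, 75)

# Estimated percentage (hypothetical figures): 5,000 artisans, assumed to
# form between 4% and 5% of the population of the time.
artisans = 5000
lowest_population = total_from_percentage(artisans, 5)
highest_population = total_from_percentage(artisans, 4)
```

The first result is exact; the second is a range, and like every figure computed from an estimated relative number it is approximate only.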

Derivative Series. 

A set of relative numbers or simple derivatives of the same
kind spread, say, over a period of time would constitute a
derivative series. The special feature of a derivative series
is that it eliminates the factor or factors obstructing effective
comparison: all figures are related to a common denominator
and comparison is facilitated. A number of figures
representing burden of income-tax per head of population over
a period of time forms a derivative series. The series would
eliminate the main effects of demographic changes.
Population on which the burden is computed may change, yet the
burden per head of population for one year shall be comparable
with similar burden for another year. The actual figures of
population and amount of tax would not be so easy to compare
from year to year because of their fluctuations.

The test of a derivative series lies in its stability, which
is determined by measures of dispersion. These measures shall
be discussed later; but it will be useful to note here that the higher
the degree of stability, the greater is the reliability of a derivative
series. To attain stability some simple precautions should be
kept in view in computing simple derivatives.

Rules and Precautions for Computing Derivatives.

Both the computation and the use of relative numbers
need caution. The rule should be to procure as much
homogeneity in the data as possible. For example, the general
death rate for a town or a country may be computed by
multiplying the number of deaths by 1,000 and dividing the
product by the total population; but this general death-rate,
or crude death-rate as it is called, relates to a heterogeneous
mass, since deaths vary with the age and sex composition of the
population. We shall later see (in chapter X) how this
defect can be remedied.

In order that there may be no misunderstanding, the basis
of calculation of the relative numbers should always be given.
If we are told that the price of a commodity increased 10 per
cent., decreased 15 per cent., increased 25 per cent., decreased
20 per cent. and then increased 15 per cent. over a period of
time, it would be difficult to say what exactly the change over
this period was. If the changes were based on the original
price it will be found that the change over the whole period
was 15% on that price. If, however, the changes were based
on the prices ruling at the time of each particular change, the
change over the period would be found to be about 7.5% of
the basic price. If the basis on which percentages were
calculated was known this variation in result would not have
occurred.
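
The two results quoted above, 15% and about 7.5%, can be verified with the following Python sketch. The variable names are our own; the five percentage changes are those of the example.

```python
# Successive percentage changes in the price of the commodity.
changes = [10, -15, 25, -20, 15]

# Basis 1: every change is reckoned on the original price, so the
# percentages may simply be added.
change_on_original_price = sum(changes)

# Basis 2: every change is reckoned on the price ruling at the time,
# so the changes compound one upon another.
price_relative = 1.0
for change in changes:
    price_relative *= 1 + change / 100
change_on_ruling_prices = (price_relative - 1) * 100
```

On the first basis the net change is 15 per cent.; on the second it is 7.525 per cent., i.e. about 7.5 per cent. of the basic price.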

Again, percentages, or other relative numbers, should
be used only when the factors which are to be expressed
in percentage form are themselves comparable. If a
company whose issued and paid up capital was Rs. 150,000
earned profits at a uniform rate of Rs. 15,000 per year
for five years, its percentage of profits to capital would
be 10 for the period. Suppose the company increased
its capital to Rs. 250,000 in the sixth year and the profits
increased from Rs. 15,000 to Rs. 22,500 in that year, the percentage
of profits to capital would be 9 only. Then, if a table showing
only the percentage profits was prepared for the six years,
it might lead one to conclude that profits had declined in the
sixth year while the actual profits had increased. In such
cases it is advisable to show the amount of capital over the
different years, the total profits earned from year to year
and, last, the rate of percentage profits. This would
avoid all fallacy.
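
The kind of table recommended here, showing capital, profits and the percentage side by side, might be built as in the sketch below. The layout of the rows is our own choice; the figures are those of the company in the example.

```python
# (year, capital in Rs., profit in Rs.) for the six years of the example.
accounts = [(year, 150_000, 15_000) for year in range(1, 6)]
accounts.append((6, 250_000, 22_500))

# Each row carries the absolute figures as well as the derived percentage,
# so that a fall in the percentage is not mistaken for a fall in profits.
table = [
    (year, capital, profit, 100 * profit / capital)
    for year, capital, profit in accounts
]

for year, capital, profit, percent in table:
    print(f"Year {year}: capital Rs. {capital:,}, "
          f"profit Rs. {profit:,}, {percent:.0f} per cent.")
```

The sixth row shows profits rising from Rs. 15,000 to Rs. 22,500 while the percentage falls from 10 to 9, which is exactly what a table of percentages alone would conceal.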

Lastly, percentages should not ordinarily be used when the
number of items in one of the series to be compared is less
than one hundred. Similarly, rates per thousand or rates per
ten thousand should not be computed when the number of
items is comparatively very small. Advertisements very often
appear in the newspapers: "Join X school. This year's
results 100 per cent." Another institution may have a
percentage of only 92. A prospective candidate may be led to think
better of the first institution. But, if on an inquiry it is found
that only 3 candidates appeared from X school and all got
through, while 250 candidates appeared from the second
institution of which 230 came out successful, opinion will have
to be reversed, for it is always more difficult to get the same
percentage result from a much larger number. Mathematically
both the percentages are absolutely correct, but statistically
the percentage relating to the result of X school is not
significant. It is, therefore, again established that comparison
through percentages alone is not sufficient unless the data on
which they are based are homogeneous and capable of
comparison.
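
One way of applying this precaution in practice is sketched below in Python. The function `describe_result` and its threshold argument `min_base` are our own devices, set at the figure of one hundred suggested above; they are not a standard rule of any library.

```python
def describe_result(passed, appeared, min_base=100):
    """State a result as a percentage only when the base is large enough.

    When fewer than `min_base` candidates appeared, the absolute figures
    are reported instead, since a percentage would not be significant.
    """
    if appeared < min_base:
        return f"{passed} out of {appeared}"
    percent = 100 * passed / appeared
    return f"{percent:.0f} per cent ({passed} out of {appeared})"


x_school = describe_result(3, 3)
second_institution = describe_result(230, 250)
```

Stated in this way, X school's result reads "3 out of 3" and the second institution's "92 per cent (230 out of 250)", and the reader is no longer misled by the bare figure of 100 per cent.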

Ratios. 

To eliminate the chances of fallacious conclusions, which
might result from the use of percentages when their bases of
calculation are not specified, some statisticians emphatically
recommend the use of ratios in place of percentages. Then we
shall say that the price of the commodity increased in the
ratio of 100:110 rather than that it increased 10%.

But ratios must also be used with caution, otherwise
wrong inferences might be drawn. Suppose 800 candidates
appeared at a certain examination of which 600 came out
successful. The ratio of passes to failures is, therefore, 6:2.
Further suppose college A coached 500 out of the 800
candidates, and of these 400 passed, so that the ratio of the
successful to the failed candidates is 4:1, whereas of the remaining 300
candidates 200 must have passed, which gives the ratio of
successes to failures as 2:1. It might appear that college A
achieved twice as good results as all those colleges taken
together through which the remaining 300 candidates appeared.
This conclusion is fallacious as it is not known whether
all the 300 students were given adequate coaching or
they simply appeared through certain colleges. If they were
not properly coached they did not stand the same chances of
success as those 500 candidates who were given due coaching.
It is advisable, in such cases, to show the ratios of all the
coaching institutions in order to ascertain which of them was
really the best.
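
The three ratios of this example can be worked out as in the Python sketch below. The helper `reduce_ratio` is our own; it merely divides both terms by their greatest common divisor.

```python
from math import gcd


def reduce_ratio(successes, failures):
    """Ratio of successes to failures in its lowest terms."""
    divisor = gcd(successes, failures)
    return successes // divisor, failures // divisor


appeared, passed = 800, 600
college_a_appeared, college_a_passed = 500, 400

overall = reduce_ratio(passed, appeared - passed)
college_a = reduce_ratio(college_a_passed,
                         college_a_appeared - college_a_passed)

# The remaining candidates and passes follow by subtraction.
others_appeared = appeared - college_a_appeared
others_passed = passed - college_a_passed
others = reduce_ratio(others_passed, others_appeared - others_passed)
```

The overall ratio comes out as 3:1, which is the 6:2 of the text in lowest terms, against 4:1 for college A and 2:1 for the rest. The arithmetic is sound; the fallacy lies in treating the remaining 300 candidates as one comparable coaching institution.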

Use of Simple Derivatives. 

Ratios, rates per unit, per hundred, per mille are widely
used and easily understood. Sex ratio, cost per unit of output,
income per capita, percentage rate of interest, percentage of
exports or imports to total trade, birth and marriage ratios
per thousand, mortality rate per ten thousand, burden of tax
per head of population, net reproduction rate are
very commonly used derivatives and indicate the variety of
field to which they are applied. They are very commonly used
in business, social and administrative statistics.


EXERCISES

(1) Define a statistical derivative, and point out the
usefulness of its computation in statistical studies.

(2) Clearly distinguish between Subordinate and Co-ordinate
derivatives.

(3) What purposes do statistical derivatives serve? Do they
give the whole information about the series from which they
are derived?

(4) What is a derivative series? How does it differ from
(i) series of individual observations and (ii) frequency
distributions?

(5) What rules and precautions will you observe in computing
percentages, and why?

(6) What are ratios? Why are they considered as better
than percentages?

(7) What precautions are necessary in using ratios and
percentages?

(8) Explain what you understand by (a) amount of tax per
tax payer, (b) burden of tax per head of population, (c) net
reproduction rate, (d) income per capita, (e) yield per acre, (f)
cost per unit of output.

(9) Write a note on the importance of simple derivatives to
businessmen, professional speakers, legislators and laymen.

(10) Point out the ambiguity or mistake, if any, in the
following statements:

(1) The death-rate in the American navy during the
Spanish-American war was nine per thousand while
in the city of New York for the same period it was
sixteen per thousand. It was safer, then, to be a
sailor in the American navy than to live in New
York City.

(2) 13% of the total population in 1941 in India was
urban as against 11% in 1931. Therefore, the
number of towns in India has considerably increased
during the decade.

(3) Population of India has increased 15% in 1941 over
the population in 1931. Therefore, the consumption
of food grains per head has fallen in 1941.

(4) The increase in the wages of a labourer was [illegible].
Then the wage decreased 25%, and again increased
[illegible]. Therefore, the resultant increase in the
wages was 10%.

(5) Cows are multiplying faster than human beings in India.
The consumption of milk by human beings is
therefore increasing.

(6) 50 candidates appeared from a college in the B.A.
examination, of which [illegible] were successful and two
obtained first division. From another college 5
candidates appeared at the same examination and
80% passed, none being placed in the first division.
The latter college showed a better result than the
former.

(11) The following table shows the growth of India's
population as recorded by successive censuses:

Year          Population
              (000,000's omitted)
1872    . .   210
1881    . .   250
1891    . .   290
1901    . .   295
1911    . .   315
1921    . .   320
1931    . .   353
1941    . .   389

Calculate the percentage increase for each successive year
over the preceding year's population.



CHAPTER X 


STATISTICAL AVERAGES

Simple statistical derivatives, by themselves, are 
insufficient to give a summary description of the peculiarities 
of a series, nor can they be used as types representing the 
series. They throw light only on the relative aspects of a 
series, and are not characteristic of the data. Therefore, 
some other method of precisely and concisely describing the
series has to be devised. This is the method of statistical 
averages. Through it a number, representative or character- 
istic of the entire group, is computed, which affords the 
central idea of the series and can be used in place of the data. 

Averaging is the process of condensation. An average is 
a single simple expression in which the net result of the whole 
series is concentrated. It is a number, representative of the 
group, its gist. An average brushes off the irregularities of 
a series, levels all differences of the individual items and
presents complex data and unwieldy numbers in a few signi- 
ficant figures. It thus gives a bird’s-eye view of an aggregate, 
and can be substituted for individual items in further calcula- 
tions regarding the series. 

An average is a typical item to represent a group. It is,
therefore, also called type. A type would naturally describe a
group better than any other value. One object of a type or
average, then, is to give a concise picture of a large group, to
describe the series it represents. Another object, which
follows from the first, is to afford a basis of comparison with
other groups. It follows that an average may be computed
for its own sake, or as a means to another end which may be
comparison, or measurement of dispersion¹ or of skewness¹ of
the series. This is quite obvious. It is difficult to grasp the idea
if we are given the age of every person in a country, but the
average age of the people of the country is something definite
and intelligible. Similarly, two series each containing ages of
different people in two countries, even if compressed into a
few magnitude classes, will not afford a comparison between
the ages of the people in the two countries. If, however, some
sort of average of the two series is computed, average ages
in the two countries shall be comparable at a glance. And
comparison of the average ages shall usually be equivalent
to comparison between the two series.

¹ For Dispersion and Skewness see Chapter XI.

Homogeneity of Data. 

It is necessary to say that the data from which averages 
are computed must be as largely homogeneous as possible. 
For, if the different items are not alike in relevant aspects there 
is no sense in grouping them together, and consequently, no 
justification for computing their averages. The comparison of 
only those averages yields reliable conclusions which refer to 
homogeneous masses. The comparison of averages computed 
from heterogeneous series may easily mislead and is only 
reliable under certain special conditions. The significance of 
averages lies in the fact that they exhibit the result of the
activity of complex causes in one characteristic figure. The
average wage, for instance, of a given group of workers gives
a measure of the factors determining wages in that group. It
is important that this average should refer to as unified a
complex of causes as possible, since then alone will it be reliable
for purposes of comparison or as a type. If the wages in two
factories are determined by quite different causes, the average 
wages for all the workers in the two factories shall not yield 
a trustworthy comparison. Hence the importance of homo- 
geneous series. 





Homogeneity can be attained by (1) eliminating the unlike
from the like items and (2) dividing the like items into groups
as nearly homogeneous as possible. In our example, it will
be necessary to disregard the cases in which workers for
personal reasons do not receive any wages at all. Among those
getting wages, workers may be distinguished whose wages are
influenced by different and independent causes. They will
have to be divided into more homogeneous parts, so that the
averages may refer to a unified complex of causes, and thus
be reliable for comparison between two periods or two places.
Our wage-earners may consist of males and females, the latter
getting lower wages. Therefore, male workers will have to
be separated from female workers, and separate averages
computed. There is still a possibility of these two groups
being sub-divided into more homogeneous parts, e.g., skilled
male workers and unskilled ones. The extent to which
homogeneity should be attained shall, however, be determined
by the purpose for which averages are required. We hear of
the average tax per tax-payer as also of average tax per head
of population. Both have different purposes: the first is a
measure of the burden on the tax-payers only, the second of
that on the whole population.

Kinds of Average. 

The following four kinds of average, or mean, are in
common use:

(1) The Mode, (2) The Median, (3) The Arithmetic
Average, and (4) The Geometric Mean.

In addition, there are other forms of average such as the
Harmonic Mean and the Quadratic Mean; but they are not
in common use.


THE MODE 

The mode is the value of that item in a variable which
occurs most frequently or is repeated the greatest number of
times. It lies at the position of greatest density. It is the
typical measurement or most fashionable point. It is the
usual, and not casual, size of item in a series. When we
speak of the average student, the average wage, the average
rent etc., we generally imply the modal student, the modal
wage, the modal rent. If we say that modal marks obtained
by students in a class are 40, we mean that 40 are the
predominant marks, i.e., the largest number of students secured 40
marks. As high as 70 marks and as low as 15 marks are
exceptions; they are much less frequented; they are
non-modal.

Since mode is the most frequent size, it appears it is easy
to locate it. Really it is so, if there is a single well defined
mode. Then the size of item, or the group containing the
maximum frequency, can be easily located in a frequency table.
But, it is not improbable that there may be numerous
irregularities in the table, so that the position of the mode becomes
indefinite and modal size is not easy to locate. In such cases
frequencies are adjusted by the process of grouping, i.e. by
widening the groups into which the frequencies fall until
a modal size of item or a modal group presents itself. This
modal size is the mode in a discrete series, while in a
continuous series the mode will be located by interpolation in the modal
group on the assumption that the frequencies of the groups on
either side of the mode influence the mode in proportion to
their respective numbers.

Location of Mode: Discrete Series:

Example 1. Required to find the mode in a discrete series. 





Table 5.

Location of Mode by Grouping.

Size of item (m) | Frequency (f): columns (1), (2), (3), (4), (5), (6)

[The entries of this table are illegible.]



The frequencies given in column (1) are first grouped by
two's in columns (2) and (3), and then by three's in columns
(4), (5) and (6), and the maximum frequency in each column
is indicated in heavy type. But no fixed point where frequency
may be the largest is obtained. The mode seems to change
with change in the grouping. According to column (2) it
may be 10 or 11, while according to column (5) it may be 8,
9 or 10. The following table shows the sizes of maximum
frequency in different columns.





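Table 6. Analysis Table.

Column        | Size of item containing maximum frequency
--------------+------------------------------------------
(1)           |                    11
(2)           |               10   11
(3)           |           9   10
(4)           |               10   11   12
(5)           |      8    9   10
(6)           |           9   10   11
--------------+------------------------------------------
No. of times  |      1    3    5    4    1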


From the above table we find that 10 is the size of item 
which is most frequented. It is not true of any other size. 
The mode is, therefore, located at 10. 
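
The process of grouping and the analysis table can be sketched in Python as follows. The frequencies used here are illustrative figures of our own, chosen so that the six columns give the same maxima as those recorded in Table 6; they are not offered as the original entries of Table 5. The function names are likewise our own.

```python
from collections import Counter


def grouping_columns(freqs):
    """Build the six grouping columns as lists of (start, width, total).

    Column (1) holds single frequencies; columns (2) and (3) group by
    two's, starting from the first and the second item; columns (4), (5)
    and (6) group by three's, starting from the first, second and third.
    """
    columns = []
    for width, offsets in ((1, (0,)), (2, (0, 1)), (3, (0, 1, 2))):
        for offset in offsets:
            groups = [
                (start, width, sum(freqs[start:start + width]))
                for start in range(offset, len(freqs) - width + 1, width)
            ]
            columns.append(groups)
    return columns


def mode_by_grouping(sizes, freqs):
    """Return the most frequented size(s) and the analysis-table tally."""
    tally = Counter()
    for groups in grouping_columns(freqs):
        largest = max(total for _, _, total in groups)
        for start, width, total in groups:
            if total == largest:
                tally.update(sizes[start:start + width])
    top = max(tally.values())
    return [size for size in sizes if tally[size] == top], dict(tally)


# Illustrative data (an assumption, see the note above).
sizes = list(range(4, 14))
freqs = [2, 5, 8, 9, 12, 14, 14, 15, 11, 13]

modes, analysis = mode_by_grouping(sizes, freqs)
```

With these figures size 11 has the largest single frequency, yet size 10 is marked five times in the analysis against four times for size 11, so the mode is located at 10.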

A glance at the frequencies in column (1), table 5, might 
lead one to think that size 11 is the mode since it contains the 
largest frequency in that column. But this impression is 
corrected by the process of grouping which clearly shows that 
mode is influenced by the frequencies of the neighbouring sizes.
It is, therefore, evident that it is not always easy to locate the 
mode by mere inspection. Inspection can give reliable results 
only where the frequencies run fairly regularly and mode is 
unquestionably clear and well-defined. We shall see it in the 
following example. 



Continuous Frequency Distribution: — 

Example 2. Required to locate the mode in a continuous 
series whose frequency distribution is given in table 1.

We find that the figures in the said table are fairly 
regular and the 15 — 20 marks group indisputably con- 
tains the maximum frequency. Without any prelimi- 
naries we can say that the mode lies in 15 — 20 marks 
group. If we group the data, first in 10 marks group 
and next in 15 marks group, and then prepare from the fre- 
quency columns an Analysis Table as we did in example 1 
above, we shall arrive at a similar conclusion. Having known 
the class containing the maximum frequency we shall locate 
the mode in that class on the assumption already noted, viz., 
according to the weights or influence of the neighbouring 
groups. 

The formula for it will be as follows¹:

\[ Z = l_1 + \frac{f_1 - f_0}{2f_1 - f_0 - f_2}\,(l_2 - l_1) \]

Where $Z$ stands for the mode²,

$l_1$ and $l_2$ stand for the lower and upper limits of the
modal group,

$f_1$ stands for frequencies in the modal group,

$f_0$ stands for frequencies in the group preceding the modal
group,

$f_2$ stands for frequencies in the group succeeding the
modal group.

Applying the above formula we have,

\[ Z = 15 + \frac{16 - 10}{32 - 10 - 12}\,(20 - 15) = 18 \text{ marks.} \]
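
The interpolation may be checked with a short Python sketch. The function name and its argument names are our own; the figures are those used in the application above (modal group 15 to 20 with frequency 16, neighbouring frequencies 10 and 12).

```python
def interpolated_mode(l1, l2, f0, f1, f2):
    """Mode located within the modal group [l1, l2].

    f1 is the frequency of the modal group, f0 and f2 those of the
    groups preceding and succeeding it.
    """
    denominator = 2 * f1 - f0 - f2
    if denominator == 0:
        raise ValueError("neighbouring frequencies equal the modal frequency")
    return l1 + (f1 - f0) * (l2 - l1) / denominator


mode_marks = interpolated_mode(l1=15, l2=20, f0=10, f1=16, f2=12)
```

The result is 18 marks, agreeing with the working shown. When the two neighbouring frequencies are equal the formula places the mode at the middle of the modal group, as one would expect.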


¹ This formula, it is held, is more accurate than the customary
formula, viz. [formula illegible].

² In all our further calculations mode will be represented by Z.



STATISTICAL AVERAGES 




The process of grouping is, in fact, a method of smoothing
out the irregularities of the series and can be profitably em- 
ployed even in a series which does not appear to be irregular. 
Rather, it should be resorted to in all elementary work. 

In exceptional cases, where the distribution of frequencies 
is very irregular, two or more groups on either side of the 
modal class-interval may be used as weights. But recourse 
should not be had to it if the series is multi-modal. It should 
be noted that all series do not possess a single or even a well-
defined mode. Some are bi-modal, some tri-modal, i.e. have
more than one mode, while others can hardly be said to possess
a mode at all. Therefore, efforts should not be wasted over
forcing the appearance of an exact mode when, in fact, one
does not exist. True mode should not be expected in a series
which is markedly asymmetrical.

Advantages of the Mode. 

1. It is easily understood and has a general and precise 
usage. 

2. It eliminates extreme (and therefore abnormal)
variations. That is, its value is not affected by stray items 
differing much from it in their values. 

3. For its determination it is not necessary to know the 
extreme items, except that they are few. Only the size of the 
middle items need be known. 

4. It refers to a measurement whose expectation in the 
series is the greatest. That is, it is the most likely and not 
isolated example. 

5. It can be located by mere inspection in certain cases. 

Disadvantages of the Mode. 

1. It is frequently ill-defined and indefinite: A modal 
size of city may convey any meaning. 

For the precise meaning of asymmetrical series see Chapter XI.





2. It is often indeterminate and, therefore, difficult to
locate.

3. It is incapable of being located by any simple arith- 
metical process. 

4. It rejects all exceptional instances and is, therefore,
not useful in those cases where weights are to be given to
extreme variations.

5. It may not be fully representative of a group in which 
items of uniform size are comparatively small. For instance, 
if in a community of 200 people only 4 people earned Rs. 30
each while the earnings of the rest were at any figure other than
Rs. 30, and no other four people received an equal amount, 
Rs. 30 would be the modal earnings simply because they were 
earned by the maximum number of people. This difficulty, 
however, can be got over by using class-intervals of consider- 
able magnitude. 

6. It is unsuitable for further algebraical treatment. 

7. Mode multiplied by the number of items does not yield 
the total value of the items. 

Uses of the Mode. 

The concept of mode is readily intelligible, and is applied 
in many cases in daily routine, though involuntarily. We 
often hear people say 'Average calls on my telephone are
15 a day', 'Average size of shoe sold at my shop is such and
such', 'The average page contains 300 words', 'The average
student spends Rs. 50 a month'. In all such cases what they
mean by the average is really the mode, that is, the likeliest figure.
If we are required to guess the average of a certain pheno-
menon, we shall generally, and rightly too, guess the mode,
the dominant or prevailing size.

The use of mode is now increasing in business. It serves 
as a reliable guide in business forecasting. It is being realized 
that it is of great value in studying output. Modal output 
per machine can be ascertained by recording the output of 





similar machines and finding the output which is more or less 
the same over a period of time. If this modal output is left 
far behind in subsequent years, reasons for it may be traced 
back to defects in machines or their handling, lack of skill on 
the part of operatives, or any other inefficiency. This useful
work might remain undone, or its urgency may not be felt, in 
the absence of a knowledge of the modal output which acts as 
the standard for comparison. Similarly, modal time for pro- 
ducing a commodity may be ascertained which would stand 
for the most likely time that would be required to turn out 
similar goods under similar conditions. On the basis of this 
modal time cost of producing a certain number of commodities 
may be calculated. Mode, thus, has great potentialities for 
being employed in business and commerce profitably. 

Meteorological forecasts, which are proving very important 
to mercantile and other interests, are really based on the mode. 

THE MEDIAN 

Median is the value of that item in a series which
divides the series into two equal parts, one part consisting of
all values less, and the other all values greater than it.

That is, if a series is arrayed, or which comes to the same thing,
the values of its items are placed side by side in ascending or
descending order of their magnitude, the value of the middle
item of the array is the median. If the students of a class, 43
in number, be asked to stand in order of their height, the 22nd
student from either side shall be the one whose height will be
called the median height of the class. This method of picking
up the median item can be symbolically expressed as follows:

$$M = \text{Size of } \left(\frac{n+1}{2}\right)^{\text{th}} \text{ item}$$

Where M represents the median, and n the number of
items.

Median is the size or value of the middle item, and not the rank or
the number of such item.

In our further calculations M will represent the median.




Determination of the Median: Individual measurements:

Example 3. Required to find the median in a series of 
quantitative individual observations, relating to the monthly 
expenditure incurred by 35 students in a boarding house. 

The given figures are first arrayed as follows : 


Table 7. Monthly Expenditure of 35 Students arranged in
Ascending order of Magnitude.

Serial  Expendi-   Serial  Expendi-   Serial  Expendi-   Serial  Expendi-   Serial  Expendi-
No.     ture       No.     ture       No.     ture       No.     ture       No.     ture
        Rs.                Rs.                Rs.                Rs.                Rs.
  1      35          8      40         15      45         22      46         29      50
  2      35          9      41         16      45         23      47         30      50
  3      36         10      42         17      45         24      47         31      52
  4      38         11      42         18      45         25      47         32      52
  5      38         12      44         19      45         26      48         33      54
  6      40         13      44         20      46         27      48         34      55
  7      40         14      44         21      46         28      48         35      60


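Applying the above formula we have,

$M$ = Size of $\left(\frac{n+1}{2}\right)^{\text{th}}$ item; n, in this case, equals 35,

= Size of $\left(\frac{35+1}{2}\right)^{\text{th}}$ item, i.e. the 18th item,

= Rs. 45.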


The number of items in the above example was odd and, 
therefore, there was no difficulty in locating the middle item 
(18th in this case). But, the number may be even. In such 










a case, the median is intermediate between the values of the 
two middle items. Supposing, in the above example, the 19th
student's expenditure was Rs. 46 and an additional, 36th,
student's expenditure was Rs. 61, then

$M$ = Size of $\left(\frac{n+1}{2}\right)^{\text{th}}$ item; n, in this case, equals 36,

= Size of $\left(\frac{36+1}{2}\right)^{\text{th}}$ item, i.e. the 18.5th item,

= $\frac{\text{Size of 18th item} + \text{Size of 19th item}}{2}$

= Rs. $\frac{45 + 46}{2}$ = Rs. 45-8-0.
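The picking of the middle item, for both an odd and an even number of items, can be sketched in Python as follows. The helper names `size_of_item` and `median` are chosen here for illustration only; the data are the figures of table 7, and the even case is built from the supposition made in the text.

```python
def size_of_item(ordered, position):
    """Size of the item at a 1-based position in an arrayed series.

    A fractional position such as 18.5 is resolved by interpolating
    between the two neighbouring items.
    """
    whole = int(position)
    fraction = position - whole
    size = ordered[whole - 1]
    if fraction:
        size += fraction * (ordered[whole] - size)
    return size


def median(values):
    ordered = sorted(values)                     # array the series first
    return size_of_item(ordered, (len(ordered) + 1) / 2)


# Table 7: monthly expenditure (Rs.) of 35 students.
table_7 = [35, 35, 36, 38, 38, 40, 40, 40, 41, 42, 42, 44, 44, 44,
           45, 45, 45, 45, 45, 46, 46, 46, 47, 47, 47, 48, 48, 48,
           50, 50, 52, 52, 54, 55, 60]
print(median(table_7))                           # size of the 18th item

# Even case: the 19th item becomes Rs. 46 and a 36th item of Rs. 61 is added.
even_case = table_7[:18] + [46] + table_7[19:] + [61]
print(median(even_case))                         # midway between 18th and 19th items
```

A result of 45.5 in the even case corresponds to Rs. 45-8-0, eight annas being half a rupee.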


Discrete Series:

In a discrete series also the size of $\left(\frac{n+1}{2}\right)^{\text{th}}$ item shall
be the median.

Example 4. Required to locate the median of the data
given in table 5.

Table 8. Cumulative Frequency Table.

Size of item    Frequency    Cumulative Frequency
     m              f                 cf
     4              2                  2
     5              5                  7
     6              8                 15
     7              9                 24
     8             12                 36
     9             14                 50
    10             14                 64*
    11             15                 79
    12             11                 90
    13             13                103
    14              9                112
    15              7                119
    16              4                123
    17              3                126






$M$ = The size of $\left(\frac{n+1}{2}\right)^{\text{th}}$ item; n equals 126,

= The size of $\left(\frac{126+1}{2}\right)^{\text{th}}$ item, i.e., 63.5th item,

= 10.

[It should be noted that in this series the size of all items 
beyond the 50th and up to the 64th is 10.] 
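The reading of the median from the cumulative frequency column can be sketched as below. The function name `size_at` is an assumption made for this illustration, and the sketch simply returns the first size whose cumulative frequency reaches the required position; it does not interpolate when the position falls between two different sizes.

```python
from itertools import accumulate

# Table 8: sizes of item and their frequencies.
sizes       = [4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17]
frequencies = [2, 5, 8, 9, 12, 14, 14, 15, 11, 13, 9, 7, 4, 3]


def size_at(position, sizes, frequencies):
    """Size of the item at a 1-based position, read off the cumulative
    frequency column: the first size whose cf reaches the position."""
    for size, cf in zip(sizes, accumulate(frequencies)):
        if position <= cf:
            return size
    raise ValueError("position lies beyond the last item")


n = sum(frequencies)                          # total number of items
m = size_at((n + 1) / 2, sizes, frequencies)  # size of the 63.5th item
print(n, m)
```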


In the case of continuous frequency distribution, however, 
the median will have to be interpolated in the class containing 
the median, if the original data are not available. Interpolation
gives only an approximate value. It is done on the assump-
tion that the size of items in the median class is uniformly
spread over its frequency.


Continuous Series:

Example 5. Required to locate the median in the continu- 
ous frequency distribution given in table 1. 

Table 9. Cumulative Frequency of marks of 60 Students in
Economics.

Marks-Group    Frequency    Cumulative Frequency
   0-5             4                  4
   5-10            6                 10
  10-15           10                 20
  15-20           16                 36
  20-25           12                 48
  25-30            8                 56
  30-35            4                 60


$M$ = Size of $\left(\frac{n+1}{2}\right)^{\text{th}}$ item

= Size of $\left(\frac{60+1}{2}\right)^{\text{th}}$ item = Size of 30.5th item.







If we had the original data with us the size of 30.5th item
could have been directly determined. But since we do not have
it, we can only estimate the median. The 30.5th item is situated in
the 15-20 marks group. This group's frequency is 16 and
magnitude 5. It is assumed that these 5 marks are evenly
distributed over the 16 students. The 20th student gets
approximately 15 marks. Therefore, the 30.5th student shall

get $\frac{5}{16}(30.5 - 20)$ or about 3.28 marks more than the 20th

student. Thus, the size of 30.5th item in our series, that is
the median, is 18.28 marks.

The above calculation can also be symbolically expressed
as:

$$M = l + \frac{i}{f}\,(m - c)$$

where M represents median, l lower limit of the group in
which median is situated, i the magnitude of the class interval,
f the frequency of the class-interval, m the number of middle
item or $\left(\frac{n+1}{2}\right)^{\text{th}}$ item, and c the cumulative frequency of the
group lower than the one in which median is situated.

Applying the above formula we have,

$$M = 15 + \frac{5}{16}\,(30.5 - 20) = 18.28 \text{ marks.}$$
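The interpolation of the median in a continuous series may be sketched in Python as under. The function name `grouped_median` and the way the classes are passed in (lower limits, a common class magnitude and the frequencies) are assumptions of this sketch; the figures are those of table 9.

```python
def grouped_median(lower_limits, width, frequencies):
    """Interpolate the median of a continuous series: M = l + (i/f)(m - c)."""
    n = sum(frequencies)
    m = (n + 1) / 2                     # number of the middle item
    c = 0                               # cumulative frequency below the class
    for l, f in zip(lower_limits, frequencies):
        if m <= c + f:                  # the median class is reached
            return l + width / f * (m - c)
        c += f
    raise ValueError("empty distribution")


# Table 9: marks of 60 students, class magnitude 5.
lower_limits = [0, 5, 10, 15, 20, 25, 30]
frequencies  = [4, 6, 10, 16, 12, 8, 4]
M = grouped_median(lower_limits, 5, frequencies)
print(round(M, 2))
```

The unrounded value is 18.28125, which the text gives as 18.28 marks.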

The assumption made above would have been still more 
clear, were the class-intervals arranged according to the 
inclusive method, and not according to the exclusive one 
followed in the arrangement in the above example or in table 1. 
We take such an example below: 

Formulae slightly different from this are also given by certain
authors. We, however, feel that this formula is satisfactory, as it is in
keeping with the assumption we have made for interpolation.




Example 6. Required to determine the median. 

Table 10. Cumulative Frequency of marks of 60 Students in
Economics.

Marks-Group    Frequency    Cumulative Frequency
   1-5             4                  4
   6-10            6                 10
  11-15           10                 20
  16-20           16                 36
  21-25           12                 48
  26-30            8                 56
  31-35            4                 60


Here, as in the former example, the middle item is 30.5th 
which lies in the group (16 — 20) marks, whose magnitude is 
5 and frequency 16. Also, the 20th student gets approximately 
15 marks, so that, 

$$M = 15 + \frac{5}{16}\,(30.5 - 20) = 18.28 \text{ marks.}$$

Advantages of the Median. 

1. It is easily understood. 

2. It eliminates the effect of extreme (and therefore
abnormal) variations.

3. It can be determined without a knowledge of the 
magnitude of extreme items, provided the number of items is 
known. 

4. It is usually, e.g. when found exactly, an actual 
example from the data. 

5. It can be located by inspection in certain cases. 

6. It can be exactly located. 

7. It is specially useful for considering data, the items 
of which are incapable of being quantitatively measured. 
A group of students may be made to stand in order of their 






intelligence. The middle student shall represent the median 
intelligence. Median can, therefore, be employed to serve as 
an average, yielding a sufficiently reliable representative, in 
an estimate of qualities like honesty, health, virtue which 
cannot possibly be expressed in specific units. 

Disadvantages of the Median. 

1. The fact whether the median is representative of the 
variable depends upon the nature of the distribution of the 
values. It may not be representative when the distribution is 
irregular, i.e. the items vary greatly in magnitude. Let the
runs made by the players of a cricket team be 0, 1, 3, 6, 9, 12, 
48, 60, 60, 60, 98. The median runs, 12, are made only by one 
player, while 60 are scored by three players. Median is not 
a typical representative in this case. 

2. It cannot be precisely determined when it falls 
between two values. Then, it can only be estimated. When 
estimated, it may be a value not found in the series. If in a 
class of 40 students, 20 secure marks varying between 10 and 
20, and another 20 secure marks varying from 25 to 35, the 
median marks would be indeterminate in this case. They 
would be assumed to fall between 20 and 25, which are not 
obtained by any of the 40 students. The median marks would 
give a fictitious number. 

3. It is not capable of being located by any simple 
mathematical process. 

4. It is not useful in those cases where large weight is to
be given to extreme items, for it treats all frequencies alike.

5. It is unsuitable for arithmetic or algebraic mani- 
pulation. 

6. The aggregate value of items cannot be obtained when 
the median and the number of items are known. 

7. It requires the data to be arrayed before it can be
determined, an operation which involves considerable work.





Uses of the Median.

Median is easy to understand and is, therefore, useful for
practical purposes. It is not only useful for the study of
problems whose objects are not quantitatively measurable, but
is also valuable in comparing such data as are difficult to
measure individually and have to be grouped within certain
limits. It is, therefore, of immense use in considering social
phenomena like wages, distribution of wealth, skill, etc.
The median, however, is not very suitable for being used in
commerce, because commercial data are very widely dispersed,
i.e. they are not highly regular in distribution. Where such
is the case, we have seen, median is not a good representative.
What the businessman has in mind is usually the mode.

QUARTILES, DECILES & PERCENTILES 

The principle according to which median is determined
can be extended to divide a series into any number of parts.
The values of the items dividing a series into four equal parts
are called Quartiles. When a series is arrayed and the
median divides it into two halves, each of the lower and the
upper halves can also be divided into two equal parts. The
value of the item dividing the lower half is called the First
Quartile or the Lower Quartile, represented by $Q_1$, and the
value of the item dividing the upper half is called the Third
Quartile or the Upper Quartile, represented by $Q_3$, median being
the Second Quartile. A series is, thus, divided into four equal
parts at the first and third quartiles, and the median.

Similarly, a series may be divided into ten equal parts. 
In doing so, we shall get nine dividing positions, the values of 
which are called Deciles. We have, thus, nine deciles in a 
series, the fifth decile being the median. 

Again, a series may be divided into 100 equal parts, giving 
ninety-nine dividing positions, the values of which are called 
Percentiles. There are, thus, 99 percentiles in a series, the
fiftieth percentile being the fifth decile, or the median.





In similar manner we can have Quintiles and Octiles. 


Location of Quartiles, Deciles and Percentiles.

The principle of locating the median, is the principle 
followed here also. The given series is first arrayed. Then, 
if the series is composed of quantitative individual observations 
or is a discrete one the following formulae shall apply : — 


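$Q_1$ = The Size of $\left(\frac{n+1}{4}\right)^{\text{th}}$ item.

$Q_3$ = The Size of $\left(\frac{3(n+1)}{4}\right)^{\text{th}}$ item.

$D_1$ = The Size of $\left(\frac{n+1}{10}\right)^{\text{th}}$ item.

$D_2$ = The Size of $\left(\frac{2(n+1)}{10}\right)^{\text{th}}$ item; similarly, for the rest
of the deciles.

$P_1$ = The Size of $\left(\frac{n+1}{100}\right)^{\text{th}}$ item.

$P_2$ = The Size of $\left(\frac{2(n+1)}{100}\right)^{\text{th}}$ item; similarly for other
percentiles.

Where, $Q_1$ stands for first quartile, $Q_3$ for third quartile,
$D_1$ for first decile, $D_2$ for second decile, $P_1$ for first percentile, $P_2$
for second percentile, and n for number of observations.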

Example 7. Thus, in the series given in table 7 (individual
observations),

$Q_1$ = The Size of $\left(\frac{35+1}{4}\right)^{\text{th}}$ item; n equals 35,

= The Size of 9th item,

= Rs. 41.


In our further calculations $Q_1$ and $Q_3$ shall stand for the first and
the third quartiles respectively.

Quartiles as also deciles and percentiles refer to the size of item and
not to the rank.





$Q_3$ = The Size of $\left(\frac{3(35+1)}{4}\right)^{\text{th}}$ item; n equals 35,

= The Size of 27th item,

= Rs. 48.

$D_4$ = The Size of $\left(\frac{4(35+1)}{10}\right)^{\text{th}}$ item; n equals 35,

= The Size of 14.4th item

= The Size of 14th item + $\frac{4}{10}$ (Size of 15th item - Size of 14th item)

= Rs. $\left[44 + \frac{4}{10}(45 - 44)\right]$ = Rs. 44.4.

$P_{90}$ = The Size of $\left(\frac{90(35+1)}{100}\right)^{\text{th}}$ item; n equals 35,

= The Size of 32.4th item

= The Size of 32nd item + $\frac{4}{10}$ (Size of 33rd item - Size of 32nd item)

= Rs. $\left[52 + \frac{4}{10}(54 - 52)\right]$ = Rs. 52.8.


Similarly, in the data given in table 8 (discrete series),

$Q_1$ = The Size of $\left(\frac{126+1}{4}\right)^{\text{th}}$ item, i.e. 31.75th item,

= 8.

$Q_3$ = The Size of $\left(\frac{3(126+1)}{4}\right)^{\text{th}}$ item, i.e. 95.25th item,

= 13.


The various deciles and percentiles can also be determined 
in like manner. 
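The location of quartiles, deciles and percentiles in individual and discrete series can be sketched in Python as follows. The names `fractile` and `discrete_fractile`, and the arguments `k` and `parts`, are introduced here only for illustration. The discrete version reads the size off the cumulative frequencies without interpolation, which is an assumption of this sketch.

```python
from itertools import accumulate


def fractile(values, k, parts):
    """Size of the k(n+1)/parts-th item of a series of individual observations.

    parts = 4 gives quartiles, 10 deciles and 100 percentiles.
    """
    ordered = sorted(values)
    position = k * (len(ordered) + 1) / parts
    if not 1 <= position <= len(ordered):
        raise ValueError("position falls outside the series")
    whole = int(position)
    fraction = position - whole
    size = ordered[whole - 1]
    if fraction:
        # interpolate between the two neighbouring items
        size += fraction * (ordered[whole] - size)
    return size


def discrete_fractile(sizes, frequencies, k, parts):
    """Same idea for a discrete series, using cumulative frequencies."""
    position = k * (sum(frequencies) + 1) / parts
    for size, cf in zip(sizes, accumulate(frequencies)):
        if position <= cf:
            return size
    raise ValueError("position lies beyond the last item")


# Table 7 (individual observations).
table_7 = [35, 35, 36, 38, 38, 40, 40, 40, 41, 42, 42, 44, 44, 44,
           45, 45, 45, 45, 45, 46, 46, 46, 47, 47, 47, 48, 48, 48,
           50, 50, 52, 52, 54, 55, 60]
q1, q3 = fractile(table_7, 1, 4), fractile(table_7, 3, 4)
d4, p90 = fractile(table_7, 4, 10), fractile(table_7, 90, 100)

# Table 8 (discrete series).
sizes       = [4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17]
frequencies = [2, 5, 8, 9, 12, 14, 14, 15, 11, 13, 9, 7, 4, 3]
dq1 = discrete_fractile(sizes, frequencies, 1, 4)
dq3 = discrete_fractile(sizes, frequencies, 3, 4)
print(q1, q3, round(d4, 1), round(p90, 1), dq1, dq3)
```

The range check in `fractile` guards against positions below the first item, which arise for the lowest percentiles of a short series (for 35 items the first percentile would be the 0.36th item).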

If the data are grouped into certain defined limits,
quartiles, deciles and percentiles shall be located by inter-
polation, which yields approximate values. The formulae to be
used shall be:

$$Q_1 = l + \frac{i}{f}\,(q_1 - c)$$

$$Q_3 = l + \frac{i}{f}\,(q_3 - c)$$

Where, $q_1$ and $q_3$ stand for first and third quartile numbers
respectively, and other symbols for what they did in inter-
polating the median except that the class intervals referred to
shall be those relating to $Q_1$ and $Q_3$ and not to median.
Similarly, formulae for interpolating deciles and percentiles
can be framed.

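Thus, in the continuous frequency distribution given in
table 9,

$q_1 = \left(\frac{n+1}{4}\right)^{\text{th}}$ item = 15.25th item.

$q_3 = \left(\frac{3(n+1)}{4}\right)^{\text{th}}$ item = 45.75th item.

Therefore, $Q_1 = 10 + \frac{5}{10}\,(15.25 - 10) = 12.625$ marks,

and $Q_3 = 20 + \frac{5}{12}\,(45.75 - 36) = 24.0625$ marks.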
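The interpolation of quartiles in a continuous series follows the same lines as that of the median, and can be sketched as below. The function name `grouped_fractile` and its arguments are assumptions of this sketch; the figures are those of table 9.

```python
def grouped_fractile(lower_limits, width, frequencies, k, parts):
    """Interpolate the k-th fractile of a continuous series: l + (i/f)(q - c)."""
    n = sum(frequencies)
    q = k * (n + 1) / parts            # number of the dividing item
    c = 0                              # cumulative frequency below the class
    for l, f in zip(lower_limits, frequencies):
        if q <= c + f:                 # class containing the dividing item
            return l + width / f * (q - c)
        c += f
    raise ValueError("position lies beyond the last class")


# Table 9: marks of 60 students, class magnitude 5.
lower_limits = [0, 5, 10, 15, 20, 25, 30]
frequencies  = [4, 6, 10, 16, 12, 8, 4]
Q1 = grouped_fractile(lower_limits, 5, frequencies, 1, 4)
Q3 = grouped_fractile(lower_limits, 5, frequencies, 3, 4)
print(Q1, round(Q3, 4))
```

With `k = 2` and `parts = 4` the same function yields the median of the series.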

Characteristics and Uses of Quartiles, Deciles and Percentiles.

The quartiles, deciles and percentiles are not averages in 
the sense median is, since they refer not to the whole variable 
but only to parts of it. Of course, for determining them the 
part to which they relate is treated as the whole series. Thus, 
quartiles are, in a sense, equivalent to the medians of the lower 
and the upper halves of a series, but they cannot be considered 
as averages of the first order, i.e., as sizes which can be taken
as types or substitutes of the whole series.

Yet quartiles etc. give valuable information regarding
the series. They indicate the distance within which certain 


( 3(n+l) 


( «+i 
4 



128 


statistics: theory and practice 


parts of the series lie. Thus, knowing the quartiles of the data 
given in table 7, we may say that the middle half of the series 
lies between Rs. 41 and Rs. 48. Deciles and percentiles can 
also similarly yield the information characteristic of them. 
We shall refer to the importance of quartiles again, while 
considering the manner in which items in a series are distri-
buted.


THE ARITHMETIC AVERAGE 

The arithmetic average, also called the arithmetic mean, is
the quantity obtained by dividing the sum of the values of the
items in a variable by their number. Thus, it is the average
of common speech, an average quite familiar to the layman.
Two types of arithmetic averages may be distinguished:

1. Simple Average, in which all items are treated alike i.e. 
each item is considered only once. 

2. Weighted Average, in which all items are not treated 
alike, each item being assigned a weight in proportion to its 
importance in the series. 

The Simple Arithmetic Average ; its determination. 

The sum of the values of all the items in a series is called
the Aggregate or Summation of Measurements. Summation is
denoted by the Greek letter $\Sigma$ (capital sigma). Then,

$$a = \frac{\Sigma m}{n}$$

where a represents arithmetic average, $\Sigma m$ represents sum-
mation of measurements, and n represents the number of items.

Series of Individual measurements: — 

Example 7. Required to find the simple arithmetic
average of the data given in table 7.

$$a = \text{Rs. } \frac{1580}{35} = \text{Rs. } 45.14.$$


See Chapter XI.

In all our further calculations a will stand for the arithmetic
average.





It is not necessary that the values of the items should
first be arrayed as they are done in table 7. The original
data, recorded as they occurred, could have been equally well
utilized to find the aggregate.

The above example illustrates the direct method of com-
puting the simple arithmetic mean. It involves considerable
work of addition when the series is large and digits in each
number are several. To save time and labour, the short-cut
method can be utilized if the values of the different items
happen to be nearly the same. To use this method, any size
of item may be assumed as the average. Deviations of the
value of each item from this assumed average should then be
found and put down with proper sign. The algebraic sum of
these deviations should be found out. Then,

$$a - x = \frac{\Sigma d_x}{n}$$

where x is the assumed average, and $\Sigma d_x$ the summation of
deviations from the assumed mean.

Example 8. Required to find the simple arithmetic mean
of the given data by the short-cut method.

Table 11. Short-cut Method of Computing the Arithmetic Average.

Size of items    Deviations from assumed average (1362)
      m                          dx
    1365                         +3
    1360                         -2
    1358                         -4
    1362                          0
    1370                         +8
    1363                         +1
    1368                         +6
    1364                         +2
    1371                         +9
    1362                          0

                            Σdx = +23




Deviations in the second column above have been found by
the simple formula,

$d_x$ = (Size of item - Assumed Average)

= (m - x)

where m is the size of item and x the assumed average;
$d_x$ is positive or negative according as m is greater or smaller
than x.

The algebraic sum of the deviations in table 11 is +23.
Then, according to the formula we have,

$$a - 1362 = \frac{23}{10} = 2.3$$

$$\therefore a = 1364.3.$$

The above short-cut method is based on the simple fact
that the algebraic sum of the deviations of individual values
from the arithmetic average is equal to zero. Thus, deviations
of the size of items from 1364.3 in the above example are
respectively,

+.7, -4.3, -6.3, -2.3, +5.7, -1.3, +3.7, -.3, +6.7, -2.3.
Their summation is zero.
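The direct and the short-cut methods for a series of individual measurements can be compared in a short Python sketch. The function names are assumptions made for illustration; the figures are those of table 11. Since the computer works in binary fractions, the comparisons are made to within a small tolerance rather than exactly.

```python
def mean_direct(values):
    """Direct method: aggregate divided by the number of items."""
    return sum(values) / len(values)


def mean_short_cut(values, assumed):
    """Short-cut method: a = x + (sum of deviations from x) / n."""
    deviations = [m - assumed for m in values]      # dx = m - x
    return assumed + sum(deviations) / len(values)


table_11 = [1365, 1360, 1358, 1362, 1370, 1363, 1368, 1364, 1371, 1362]

a = mean_direct(table_11)
b = mean_short_cut(table_11, 1362)
print(round(a, 1), round(b, 1))

# Deviations taken from the true average cancel out (up to rounding error).
print(abs(sum(m - a for m in table_11)) < 1e-9)
```

Any other assumed average, say 1000, would lead to the same result, only with larger deviations to add.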


Discrete Series: — 

In a series of discrete type each size of item should first 
be multiplied by its frequency, and the product summated and 
divided by the total frequencies. The quotient would give the 
simple arithmetic average according to direct method. 

For the same reason as we did in example 8 above, the 
short-cut method can also be employed in a discrete series. 
First, deviations of each size of item from the assumed average 
should be found out as in the above example. Each deviation 
should be multiplied by its frequency and algebraic sum of the 
products obtained. Then, the same short-cut formula shall 
yield the required average. Example 9 shall demonstrate the
working of these two methods in a discrete series.

Example 9. Required to find the simple arithmetic
average of the data given in table 8 by the direct and the
short-cut methods.

Let X, assumed average, for the latter method be 10. 





Table 12. Calculation of Simple Arithmetic Average by the
Direct and the Short-cut Methods.

    a           b             c                d                 e
 Size of    Frequency    Total size       Deviations        Total
 items                   of items         from assumed      Deviations
                         (col. a ×        mean (10)         (col. b ×
                         col. b)                            col. d)
    m           f            mf               dx               fdx
    4           2             8               -6               -12
    5           5            25               -5               -25
    6           8            48               -4               -32
    7           9            63               -3               -27
    8          12            96               -2               -24
    9          14           126               -1               -14
   10          14           140                0                 0
   11          15           165               +1               +15
   12          11           132               +2               +22
   13          13           169               +3               +39
   14           9           126               +4               +36
   15           7           105               +5               +35
   16           4            64               +6               +24
   17           3            51               +7               +21

            n = 126      Σmf = 1318                        Σfdx = +58


In computing a by the direct method we shall be concerned
with columns (a), (b) and (c); while in calculating it by the
short-cut method we shall be concerned with columns (a), (b),
(d) and (e).

Direct method:

$$a = \frac{\Sigma mf}{n} = \frac{1318}{126} = 10.46.$$

Short-cut method:

$$a - x = \frac{\Sigma f d_x}{n} = \frac{58}{126}$$

$$\therefore a = 10 + \frac{58}{126} = 10.46.$$
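The two methods for a discrete series can be sketched in Python as follows, using the sizes and frequencies of table 8. The variable names are chosen here for illustration only.

```python
# Table 8: sizes of item and their frequencies.
sizes       = [4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17]
frequencies = [2, 5, 8, 9, 12, 14, 14, 15, 11, 13, 9, 7, 4, 3]

n = sum(frequencies)

# Direct method: multiply each size by its frequency and add.
sum_mf = sum(m * f for m, f in zip(sizes, frequencies))
a_direct = sum_mf / n

# Short-cut method: deviations from an assumed average x, weighted by frequency.
x = 10
sum_fdx = sum(f * (m - x) for m, f in zip(sizes, frequencies))
a_short_cut = x + sum_fdx / n

print(n, sum_mf, sum_fdx)
print(round(a_direct, 2), round(a_short_cut, 2))
```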





Continuous Series:

When frequency distribution of a continuous type is given 
arithmetic average can only be calculated on the assumption 
that the values of all the items in each class are identical with 
the mid-value of the class-interval. Both the direct and the
short-cut methods can be followed in this case; and, after find-
ing out the mid-values of the class-intervals the procedure of
calculating the arithmetic average would be the same as in the 
case of discrete series. 

Example 10. Required to find the simple arithmetic 
average of the data given in table 1 by the direct and the 
short-cut methods. 

In table 13, we shall not be concerned with columns (e) 
and (f) for the direct method, and with column (d) for the 
short-cut method. 

Table 13. Calculation of Simple Arithmetic Average of Marks
of 60 Students by the Direct and the Short-cut
Methods.

    a          b           c             d                e                f
 Marks-      Mid-      Frequency    Total value      Deviations       Total devia-
 group       value                  of items         from assumed     tions (col.
                                    (col. b ×        mean (17.5)      e × col. c)
                                    col. c)
              m            f            mf               dx               fdx
  0-5         2.5          4            10              -15               -60
  5-10        7.5          6            45              -10               -60
 10-15       12.5         10           125               -5               -50
 15-20       17.5         16           280                0                 0
 20-25       22.5         12           270               +5               +60
 25-30       27.5          8           220              +10               +80
 30-35       32.5          4           130              +15               +60

                       n = 60      Σmf = 1080                        Σfdx = +30




Direct method:

$$a = \frac{\Sigma mf}{n} = \frac{1080}{60} = 18 \text{ marks.}$$

Short-cut method:

$$a - x = \frac{\Sigma f d_x}{n} = \frac{30}{60}$$

$$\therefore a = 17.5 + \frac{30}{60} = 18 \text{ marks.}$$

In the foregoing examples on simple arithmetic average 
both the direct and the short-cut methods have been demons- 
trated. It will be seen that the answers in each case by 
both the methods are exactly the same. The saving in labour 
is quite obvious. In the above examples, it is not necessary 
to use as assumed averages the values we have chosen for 
the purpose. Other values may also be so used. The answers
will not be different.
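The working for a continuous series, including the remark that any assumed average leads to the same answer, can be sketched in Python as below. The function name `mean_short_cut` and the representation of classes as pairs of limits are assumptions of this sketch; the figures are those of table 1 as used in Example 10.

```python
# Class limits and frequencies of the marks of 60 students.
classes     = [(0, 5), (5, 10), (10, 15), (15, 20), (20, 25), (25, 30), (30, 35)]
frequencies = [4, 6, 10, 16, 12, 8, 4]

# Every item in a class is assumed to have the mid-value of that class.
mid_values = [(lower + upper) / 2 for lower, upper in classes]
n = sum(frequencies)

# Direct method.
sum_mf = sum(m * f for m, f in zip(mid_values, frequencies))
a_direct = sum_mf / n


def mean_short_cut(assumed):
    """Short-cut method with any assumed average."""
    sum_fdx = sum(f * (m - assumed) for m, f in zip(mid_values, frequencies))
    return assumed + sum_fdx / n


print(a_direct, mean_short_cut(17.5), mean_short_cut(12.5))
```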

Advantages of Simple Arithmetic Average. 

1. It is easily understood and has a general usage. 

2. It is easy to calculate. Its calculation is a common 
knowledge. 

3. It utilizes all the data in the group. 

4. It does not necessitate the arraying of data as the 
median does, nor the grouping of data as the mode does. 

5. It can be known even when number of items and their 
aggregate values are known, and details of the different items 
are not available. 

6. It is determinate. It is not indefinite. 

7. The aggregate can be calculated if the number of 
items and the average are known. 

8. It affords a good standard of comparison, since the 
abnormalities in opposite directions tend to cancel each other
if the number of items is sufficiently large. 





9. It is amenable to algebraic and arithmetic mani-
pulation.

For all these qualities it is the most widely used average. 

Disadvantages of Simple Arithmetic Average.

1. It may give considerable weight to extreme (and there- 
fore abnormal) items. A millionaire would greatly affect 
the average income of a town where a majority consists 
of ill-paid artisans. 

2. It can hardly be located by inspection ; mode and 
median can be. 

3. It can ignore any single item only at the risk of losing 
its accuracy. Mode and median can be computed even when
the values of extremes are not known. 

4. The average that results may not occur in the data at 
all, and may not therefore be representative to the fullest 
degree. The average of 2, 4 and 9 is 5, which does not occur 
in the series.

5. It cannot be used when the data are incommensurable : 
median can be used in qualitative studies. 

6. This average might lead to fallacious conclusions 
when the actual figures from which it is obtained are not 
given. For instance, two students, A and B, get the following 
marks : — 



                               A        B
First Terminal Exam.          40%      60%
Second Terminal Exam.         50%      50%
Annual Examination            60%      40%


The average percentages of both of them are identical, 50. 
But A’s progress is positive while B's negative. If the average, 
50, is not supported by the percentage marks in the three 
examinations, the fact that A is progressing while B is deterio- 
rating would be concealed and a fallacious conclusion that the 
standard of both of them is the same would be drawn. 




Uses of Simple Arithmetic Average.

It is used in many social and economic studies. Its use 
is daily routine in business and commerce. It is an average 
which even a ‘ man in the street ’ understands. It is the 
common average. Statistics uses it not only as a type for com- 
parison, but for several other statistical calculations as well. 
'Average output of a commodity', 'Average imports or
exports over a period', 'Average cost of production',
'Average price': in all such expressions the average used
is the arithmetic average.

Weighted Average. 

In computing simple arithmetic average it was assumed
that all items were of equal importance. This may not always
be the case. Where items vary in importance they must be
assigned weights in proportion to their relative importance.
The value of each item is then multiplied by its weight, the pro-
ducts summated and divided by the total of the weights and
not by the number of items. The quotient is the weighted
arithmetic average.

Weight is thus a number which stands for the relative
importance of items. This relative importance may be real
or estimated. Consequently weights are actual or approxi-
mated. Actual weights should be used where they are avail-
able; otherwise, they may be estimated on the strength of
the best possible data available. For instance, if we know the 
actual number of people engaged on the teaching, clerical and 
menial staff of an institution and the average earnings of each 
class of employees, we should multiply the average earnings 
of each class by the actual employees in the corresponding 
class, summate the products and divide the sum by the number 
of employees to secure the weighted arithmetic average of 
earnings. The actual number of employees shall constitute 
actual weights. The full method of working it out is shown 
in the following example. 




Example. Required to compute the weighted arithmetic
average.

Table 14. Calculation of Weighted Average Earnings of the
Employees of X College.

Description of    Number of     Average      Product of     Estimated    Product of
the employees     employees     monthly      columns        Weights      columns
                  (actual       earnings     (2) and (3)      'w'        (3) and (5)
                  weights)
     (1)             (2)          (3)           (4)           (5)           (6)
                                  Rs.           Rs.                         Rs.
Professors            2           600          1,200           1            600
Lecturers            16           200          3,200           8          1,600
Demonstrators         4           100            400           2            200
Clerks                2            60            120           1             60
Peons                 7            15            105           4             60
Watchmen              3            14             42           1             14

Totals               34           989          5,067          17          2,534

Weighted arithmetic average:

(a) by using actual weights = Rs. $\frac{5067}{34}$ = Rs. 149-0-6

(b) by using estimated weights = Rs. $\frac{2534}{17}$ = Rs. 149-0-11

Along with demonstrating the method of computing the
weighted average, the above example also shows that it is not
necessary that the weights applied should be actual ones (as
in column 2); they may be approximate also (as in column 5),
the difference between the results obtained by using the actual
and estimated weights being only five pies, which is not
material. It should, however, be noted that when the number
of weights used is small, their size may have a considerable
effect upon the average, and therefore, if estimated weights are
used they should be approximately correct. If many weights are
used, the errors in their estimation will be mostly unbiased and
will, therefore, tend to cancel one another. The average would
not, then, be materially affected.
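
The computation of Table 14 can be sketched in a few lines of Python. This is only an illustration: the function name `weighted_mean` and the variable names are assumptions introduced here, while the figures are those of the table.

```python
# Illustrative sketch; names are hypothetical, figures are from Table 14.

def weighted_mean(values, weights):
    """Sum of value x weight, divided by the sum of the weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

earnings = [600, 200, 100, 60, 15, 14]    # average monthly earnings, Rs.
actual_weights = [2, 16, 4, 2, 7, 3]      # number of employees, column (2)
estimated_weights = [1, 8, 2, 1, 4, 1]    # estimated weights, column (5)

by_actual = weighted_mean(earnings, actual_weights)        # 5067 / 34
by_estimated = weighted_mean(earnings, estimated_weights)  # 2534 / 17

print(round(by_actual, 4), round(by_estimated, 4))
```

The two results differ only in the second decimal place of a rupee, which is the point the example makes about approximate weights.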


When we desire to calculate the average earnings
per employee, they might, at first sight, appear to be
Rs. $\frac{600+200+100+60+15+14}{6}$ or Rs. 164.6. If it were so, the
total monthly earnings would be Rs. (164.6×34) or Rs. 5596.4;
but, in fact, the monthly pay roll amounts to:

(Rs. 600×2) + (Rs. 200×16) + (Rs. 100×4) + (Rs. 60×2)
+ (Rs. 15×7) + (Rs. 14×3), that is, Rs. 5067 only. Since
Rs. 164.6 multiplied by the number of employees does not yield
the aggregate, the monthly pay roll, it is not a correct average.
If, however, we multiply the weighted average, Rs. 149-0-6, by
the number of employees, 34, we get Rs. 5067, the monthly
pay roll. The weighted average, in this case, is approximately
the same as the simple arithmetic average of the total earnings
of all the 34 employees would be.
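
The pay-roll check described above may be sketched as follows. The variable names are assumptions made for illustration; the figures are those of the X College example. The sketch also expands the six classes into 34 individual earnings to show that their simple average coincides with the weighted average.

```python
# Illustrative sketch; names are hypothetical, figures are from the example.

classes = [  # (average monthly earnings in Rs., number of employees)
    (600, 2), (200, 16), (100, 4), (60, 2), (15, 7), (14, 3),
]

payroll = sum(pay * count for pay, count in classes)
staff = sum(count for _, count in classes)

# Simple average of the six class averages, ignoring the numbers employed.
simple_of_classes = sum(pay for pay, _ in classes) / len(classes)

# Weighted average: the pay roll divided by the number of employees.
weighted = payroll / staff

# One entry per employee, then an ordinary (unweighted) average.
individual = [pay for pay, count in classes for _ in range(count)]
mean_individual = sum(individual) / len(individual)

print(payroll, staff)
print(weighted * staff)           # reproduces the pay roll
print(simple_of_classes * staff)  # overshoots the pay roll
```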

In the above example we have multiplied the size of items
(monthly average earnings) by their corresponding frequencies
(number of employees). This has been called weighting by
some writers on statistics (notably King, Boddington and
Connor), while it appears to be another device for computing
the simple arithmetic average. A few writers do not regard it
as weighting and their view is justified. Horace Secrist, for
instance, is of opinion that weights should be 'determined by
some evidence of importance other than that associated with
the items themselves'. Similarly, to Kelley weights are
'determined not at all, or not solely, by the population, but
from other evidences of importance'. Thus,


Type of Employee    Number      Relative        Productivity
                    Employed    Productivity    × Number
Male Adult             18           1               18
Female Adult            8          3/4               6
Children                4          1/2               2

Therefore, men-equivalents = 26
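
The men-equivalents above can be reproduced with exact fractions. This is a minimal sketch: the variable names are assumptions, and the relative productivities 3/4 and 1/2 are the values implied by the products 6 and 2 shown in the table.

```python
# Illustrative sketch; names are hypothetical.
from fractions import Fraction

workers = [  # (type of employee, number employed, relative productivity)
    ("Male Adult", 18, Fraction(1)),
    ("Female Adult", 8, Fraction(3, 4)),
    ("Children", 4, Fraction(1, 2)),
]

# Each class counts in proportion to its productivity, not its head-count.
men_equivalents = sum(number * productivity
                      for _, number, productivity in workers)
print(men_equivalents)
```

Here the weights rest on evidence of importance (productivity) outside the mere number of persons, which is the sense of weighting that Secrist and Kelley have in mind.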


Similarly, a teacher may assign weights to different grades 
of work done by the students in proportion to the importance 
of the grades. He may, for example, assign 4 to seminar work, 
2 to class-room work and 3 to monthly test. Marks obtained 
out of, say, 100 in each grade will be multiplied by the weight 
of the grade and the sum of the products divided by 9. This 
average will not correspond closely to the simple average. 
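
The teacher's scheme may be sketched as below. The weights 4, 2 and 3 are those of the text; the marks of the student are purely hypothetical, invented here to show the working.

```python
# Illustrative sketch; the marks are hypothetical, the weights are from the text.

weights = {"seminar": 4, "class_room": 2, "monthly_test": 3}
marks = {"seminar": 70, "class_room": 40, "monthly_test": 55}  # out of 100

total_weight = sum(weights.values())  # 4 + 2 + 3
weighted = sum(marks[k] * weights[k] for k in weights) / total_weight
simple = sum(marks.values()) / len(marks)

print(round(weighted, 2), simple)
```

With these assumed marks the weighted average stands above the simple one, because the best marks fall in the most heavily weighted grade of work.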

In fact, both the systems, allotting weights according
to actual number and according to estimates of relative
importance, are in vogue. We shall read more of them while
discussing Index Numbers.

When Should Weighted Average be Used?

(1) When the items falling into different grades or classes
of the same group show considerable variation, and it is desired to
obtain an average representative of the whole group, weighted
average is the only proper average to be used. Of course, if details
of the different grades are available, simple arithmetic average
Horace Secrist, An Introduction to Statistical Methods, New York,
1933, p. 280.

Kelley, T. L., Statistical Method, New York, 1923, p. 68.










will be quite sufficient. Thus in our example of the earnings
of X college, the simple (unweighted) average appeared as
Rs. 164.6, but it was not representative of the data. The
weighted average, Rs. 149-0-6, was a better representative. If,
however, we knew the earnings of each individual employee,
added them up and then computed their simple arithmetic average,
it would also have been an equally good representative. It is
usually found in a study of wages that the number of workmen
earning high wages is much less than the number of those
getting low wages. If, then, a simple arithmetic average of the
wages in all the occupations, treating all grades as of equal
importance, were computed, the wage of the manager would
be given as much weight as that of a coolie or a gangman and
the average would appear considerably too large. Weights cannot
be ignored in such cases. But it must not be forgotten that
proper weighting is as valuable as wrong, manipulated or
erroneous weighting is dangerous. Weights should, therefore,
be as nearly accurate as possible. We take below an
example to demonstrate the argument.

Example 12. Required to compute weighted arithmetic 
average. 

Table 15. Calculation of Weighted Average of the Percentage
Success in X & Y Universities.

University    Relative     Percentage   Product of   Percentage   Product of   Arbitrary   Product of   Product of
Examination   proportion   success in   columns 2    success in   columns 2    weights     columns 3    columns 5
              of           X            and 3        Y            and 5                    and 7        and 7
              candidates   University                University
    1             2            3            4            5            6            7            8            9
M.A.             10           80          800           75          750           15         1200         1125
M.Sc.             7           65          455           50          350           10          650          500
B.A.             40           60         2400           70         2800           10          600          700
B.Sc.            25           55         1375           75         1875            5          275          375
B.Com.           13           75          975           65          845           40         3000         2600
Totals           95          335         6005          335         6620           80         5725         5300








I. Simple average of percentage success in:

X University = $\frac{335}{5}$ = 67

Y University = $\frac{335}{5}$ = 67

II. Weighted average, by using the weights in column 2, in:

X University = $\frac{6005}{95}$ = 63.2

Y University = $\frac{6620}{95}$ = 69.7

III. Weighted average, by using the weights in column 7, in:

X University = $\frac{5725}{80}$ = 71.6

Y University = $\frac{5300}{80}$ = 66.3
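
The three sets of averages in Example 12 can be recomputed as in the sketch below. The names are assumptions introduced for illustration, and the row values are taken to be those consistent with the products and totals of Table 15 (an assumption wherever a printed cell is unclear).

```python
# Illustrative sketch; names are hypothetical.

def weighted_mean(values, weights):
    """Sum of value x weight, divided by the sum of the weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Order of rows: M.A., M.Sc., B.A., B.Sc., B.Com.
success_x = [80, 65, 60, 55, 75]    # percentage success, X University
success_y = [75, 50, 70, 75, 65]    # percentage success, Y University
candidates = [10, 7, 40, 25, 13]    # relative proportion of candidates
arbitrary = [15, 10, 10, 5, 40]     # arbitrary weights

simple_x = sum(success_x) / len(success_x)
simple_y = sum(success_y) / len(success_y)

proper_x = weighted_mean(success_x, candidates)
proper_y = weighted_mean(success_y, candidates)

arbitrary_x = weighted_mean(success_x, arbitrary)
arbitrary_y = weighted_mean(success_y, arbitrary)

print(simple_x, simple_y)
print(round(proper_x, 2), round(proper_y, 2))
print(arbitrary_x, arbitrary_y)
```

The simple averages tie, the weights of column 2 place Y above X, and the arbitrary weights of column 7 reverse that order.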


The above example makes the following facts clear:

(i) The simple averages for both the universities are
    identical. If they are used for comparing the
    percentage success, it would appear that the
    average percentage success in both of them is
    the same.

(ii) But when weights are assigned to the results in
    proportion to the number of candidates at
    different examinations, Y university appears to
    have a much better result than X. This conclusion
    is corroborated by the fact that in the B.A. and
    B.Sc. examinations, where the number of candidates
    is very large, the percentage success in Y
    university is higher (much higher in B.Sc.) than
    in X. These weighted averages, therefore, yield
    a proper comparison.





(iii) When weights given in column 7, chosen quite
    arbitrarily without regard to the proportionate
    importance of the items, are used, the conclusion
    arrived at in (ii) above is reversed: X university
    appears to indicate better results than Y. This
    inference is not corroborated by facts. Therefore,
    allotting of weights needs great care and
    caution lest fallacious conclusions result.

(iv) A comparison of the averages arrived at by using
    the weights given in columns (2) and (7) reveals
    that the averages for X university in cases II and
    III, 63.2 and 71.6, and also the averages for Y
    in the two cases, 69.7 and 66.3, differ very much
    from each other. It is so because the weights in
    column (7) do not bear to one another the same
    proportion as the weights in column (2) do. In our
    example of the earnings of the employees of X
    college, table 14, the weights in column (5) have
    nearly the same proportion among themselves as
    those in column (2) do. The averages resulting from
    the use of the two sets of weights are, therefore,
    not materially different. It is thus established
    that it is not the absolute size of the weights but
    their relative size that is important. A weighted
    average is unaltered if all the weights are
    multiplied or divided by the same quantity, that
    is, if their mutual proportions remain unchanged.
    Even if the multiplication or division is not
    accurate but only approximate, as in column (5),
    table 14, the resulting weighted average shall be
    approximately correct.
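
The invariance stated in (iv) can be verified in one line. Writing $x_i$ for the items, $w_i$ for their weights and $k$ for any non-zero multiplier (symbols introduced here for illustration only),

$$\frac{\sum (k w_i)\, x_i}{\sum k w_i} = \frac{k \sum w_i x_i}{k \sum w_i} = \frac{\sum w_i x_i}{\sum w_i}$$

so that the common factor cancels and only the mutual proportions of the weights affect the result.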

It should, however, be pointed out that the weighted
average refers to the whole group. It, therefore, does not
represent the actual conditions of any one sub-division or
grade or class of the group. It does not represent any
individual of a grade. It is useful only for general comparison.
It is a good average to use when the group as a whole, say
the whole of an industry, is surveyed. If we wish to study
the actual condition of the various sub-divisions or classes of
the group we should compute the averages of the different
classes separately and compare such averages. That is, we
should study each homogeneous part separately.

(2) Weighted average should also be used when the size
of items changes and the relative proportion of the number
of items also changes. For example, if in example 11 the
earnings of employees changed, the average earnings would
also be changed. Or, if the number of men employed were
altered, the old average will not necessarily stand. The
average will change only when the proportions are also changed.
If, for instance, the number of employees in each grade is
doubled, the proportion would not change and so the average
would remain the same.

The weighted average, generally speaking, possesses the
same advantages and disadvantages, and has almost the same
uses, as the simple arithmetic average. It is invariably applied
in the calculation of birth, marriage and death rates, and their
comparison in different places or at different times. Weighting
is essential for attaining accuracy in the result when the
series is small. In very long series weighted average and
simple arithmetic average tend to be identical. So, weighting
is not very necessary in a long series.

THE GEOMETRIC AVERAGE

The geometric average, also called the geometric mean, is
the nth root of the product of the n quantities of a series. The
geometric mean is obtained by multiplying the values of the
items together and extracting the root of the product
corresponding to the number of items. Thus, the square root of the
product of two quantities is their geometric mean. Similarly,
the cube root of the product of three quantities is the
geometric mean of three quantities. Symbolically,

$$g = \sqrt[n]{a \times b \times c \times \dots}$$

where g stands for the geometric mean, n for the number of
items and a, b, c, ... for the values of the n items. The geometric
mean of 4 and 9 is equal to $\sqrt{4 \times 9} = 6$; the geometric mean
of 2, 4 and 8 is equal to $\sqrt[3]{2 \times 4 \times 8} = 4$. When the number of
items in a series is larger than three, this process is difficult to
follow. To obviate the difficulty, the logarithm of each size is
obtained from a Mathematical Table. The logarithms of all
the values are added up and divided by the number of items.
The anti-logarithm of the quotient is the required geometric
mean. The formula is:


$$g = \text{Anti-log}\left[\frac{\log a + \log b + \log c + \dots}{n}\right]$$

Example 13. Required to calculate the geometric average.


Table 16. Calculation of Geometric Average.

Size of Items      Logarithms
      4.5             .6532
    250.0            2.3979
     12.0            1.0792
    119.5            2.0792
     30.0            1.4771
     42.0            1.6232
     75.0            1.8751
     35.4            1.5490
Rs. 568.4           12.7339



Mathematical Tables are given at the end of the book.







According to the formula we have,

$$g = \text{Anti-log}\,\frac{12.7339}{8} = \text{Anti-log } 1.6 \text{ (approximately)} = \text{Rs. } 39.81$$

The geometric mean is always less than the simple arithmetic
average, unless all the sizes of the variable are equal
in magnitude. Thus in the above example,

$$a = \text{Rs. }\frac{568.4}{8} = \text{Rs. } 71.05,$$

which is greater than the geometric mean.
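
The logarithmic method can be sketched in Python as follows. The function name `geometric_mean` is an assumption introduced here; the two small cases are the ones quoted in the text (4 and 9; 2, 4 and 8), and the eight items are those of Table 16.

```python
# Illustrative sketch; the function name is hypothetical.
import math

def geometric_mean(values):
    """Anti-log of the average of the logarithms; values must be positive."""
    if any(v <= 0 for v in values):
        raise ValueError("geometric mean needs values greater than zero")
    mean_log = sum(math.log10(v) for v in values) / len(values)
    return 10 ** mean_log

g_two = geometric_mean([4, 9])        # square root of 36
g_three = geometric_mean([2, 4, 8])   # cube root of 64

items = [4.5, 250.0, 12.0, 119.5, 30.0, 42.0, 75.0, 35.4]
g_items = geometric_mean(items)
a_items = sum(items) / len(items)

print(round(g_two, 6), round(g_three, 6))
print(g_items < a_items)
```

Working from logarithms avoids forming the very large product of all the items, which is why the tabular method is preferred when the series is long.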


Weighted Geometric Mean.

To compute the weighted geometric mean of a series of
items, each individual item should first be raised to the power
of its corresponding weight and then the results obtained should
be multiplied by one another. The root of this final product
whose index is the sum of the weights is the required weighted
geometric average. Symbolically,

$$g = \sqrt[w_1 + w_2 + \dots + w_n]{a^{w_1} \times b^{w_2} \times \dots}$$

where $w_1, w_2, \dots, w_n$ represent the weights corresponding to
the size of item to which they relate.

In practice logarithms may be used. First, the logarithm of
each individual item should be found from a mathematical
table. Each log should then be multiplied by its weight. The
sum of such products divided by the sum of the weights is the
logarithm of the required weighted geometric mean. This may
be expressed as follows:

$$g = \text{Anti-log}\left[\frac{(\log a \times w_1) + (\log b \times w_2) + \dots}{w_1 + w_2 + \dots + w_n}\right]$$





Example 14. Required to compute weighted geometric 
mean. 

Table 17. "Capital" Index of Indian Industrial Activity,
March, 1942.

Calculation of Final Index Based on Geometric Mean.

Items                           Weights   Index No.     Log.     Weight ×
                                          (1935=100)             Log.
Indian Cotton Consumption          9        149.5      2.1761    19.5849
Jute manufactures                  6        134.9      2.1303    12.7818
Steel Ingots                       5        147.1      2.1673    10.8365
Pig Iron                           8        134.4      2.1271    17.0168
Paper                              3        185.2      2.2672     6.8016
Coal                               7        110.0      2.0414    14.2898
Rail & River borne Trade          24        108.9      2.0374    48.8976
Cheque Clearances                 20         88.5      1.9469    38.9380
Notes in Circulation*              6        132.4      2.1206    12.7236
Consumption of Electricity         7        152.9      2.1847    15.2929
Totals                            95                            197.1635


According to the formula we have,

$$g = \text{Anti-log}\,\frac{197.1635}{95} = \text{Anti-log } 2.0754 = 118.9$$

The "Capital" Index of Indian Industrial Activity for
March 1942, therefore, is 118.9.
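
The calculation of Table 17 may be sketched as below. The function name `weighted_geometric_mean` is an assumption; the weights and index numbers are those of the table. Because the sketch takes logarithms at full machine precision rather than from four-figure tables, its result agrees with the printed index only to about the first decimal place.

```python
# Illustrative sketch; the function name is hypothetical.
import math

def weighted_geometric_mean(values, weights):
    """Anti-log of (sum of weight x log value) / (sum of weights)."""
    total = sum(w * math.log10(v) for v, w in zip(values, weights))
    return 10 ** (total / sum(weights))

index_numbers = [149.5, 134.9, 147.1, 134.4, 185.2,
                 110.0, 108.9, 88.5, 132.4, 152.9]
weights = [9, 6, 5, 8, 3, 7, 24, 20, 6, 7]

capital_index = weighted_geometric_mean(index_numbers, weights)
print(round(capital_index, 1))
```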

Advantages of the Geometric Mean.

1. It is determinate, provided the values of the variable 
are greater than zero. 

*April, 1935 to March, 1936 = 100.


2. It is based on all the data in the group. 

3. It gives less weight to large items and more to small 
ones than does the arithmetic average. 

4. It is particularly useful when dealing with ratios. 

5. It is amenable to arithmetic and algebraic manipulation.

Disadvantages of the Geometric Mean.

1. It cannot be used when any of the quantities is zero
or negative; for, when a quantity is zero, the product of all
quantities will be zero and g will be zero, and when a
quantity is negative, the product of all quantities may be
negative, so that g becomes unrepresentative or imaginary.

2. It may be found to lie at a point where very few (or 
even none) of the actual measurements lie. 

3. It entails much work of calculation and is difficult of 
computation. 

4. It is less easily understood than the arithmetic average. 

Uses of the Geometric Mean. 

The property of giving large weight to small items makes
geometric average a very suitable type in studying various
social and economic phenomena where it is desired to give
large weight to small items. If some items in a series are
very big and others very small it is not the arithmetic average,
median or mode but the geometric mean that yields a
representative type. If the annual incomes of, say, the employees of a
university vary between Rs. 180 and Rs. 24,000, geometric mean
of the incomes will give a good idea of their average yearly
income. If arithmetic average were used, a single salary of
Rs. 24,000 would pull the arithmetic mean very high, because
of the comparatively very low salaries of clerks and peons.
The geometric mean would reduce the effect which large values
have upon the arithmetic average. It may be remembered that
if in a series the arithmetic and the geometric means are found
to differ considerably from each other, the geometric mean
should be regarded as the better representative of the two, since
it falls within the range of the majority of the given examples.

Another important use of geometric mean is in connection 
with index numbers. Index numbers are ratios, and the 
geometric mean is particularly useful in dealing with relative 
as against absolute differences. 

Geometric mean is used in the construction of the "Capital"
Index of Indian Industrial Activity by the 'Capital' and of
"Wholesale Price Index Numbers of certain articles in India"
published in the Monthly Survey of Business Conditions in India
issued by the Office of the Economic Adviser to the Government
of India. It is used in the Board of Trade Index Number
of Wholesale Prices in Great Britain. It was used by Professor
W. S. Jevons in his study of the changes in the general level
of prices. The difficulty experienced in its calculation and
the fact that it is too abstract to be readily intelligible have
stood in the way of its popularity and general use. It is,
however, useful in the averaging of ratios and rates of interest.
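
Why the geometric mean suits ratios may be seen from a small sketch. The figures are hypothetical, invented for illustration: a price level that rises by one quarter in the first year and falls by one fifth in the second stands exactly where it began, and only the geometric mean of the two ratios reports "no change".

```python
# Illustrative sketch with hypothetical ratios; not data from the text.
import math

ratios = [1.25, 0.80]   # year-on-year price ratios (assumed figures)

geometric = math.prod(ratios) ** (1 / len(ratios))
arithmetic = sum(ratios) / len(ratios)
final_level = 100 * math.prod(ratios)   # starting from a level of 100

print(round(geometric, 6), round(arithmetic, 6), round(final_level, 6))
```

The arithmetic average of the two ratios suggests a rise of two and a half per cent. a year, which the final level contradicts.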

THE HARMONIC AVERAGE 

The Harmonic Average, also called the Harmonic Mean,
is the total number of items of a variable divided by the sum
of the reciprocals of the values of the variable. Symbolically,

$$h = \frac{n}{\frac{1}{a} + \frac{1}{b} + \frac{1}{c} + \dots}$$

where h stands for the harmonic mean, a, b, c, ... represent
the values of the n items of the variable, and n is the number
of items. Reciprocals of numbers can be easily obtained from
a Mathematical Table.

The Mathematical Tables given at the end of the book give
reciprocals of natural numbers.





The harmonic mean can also be expressed as the reciprocal
of the arithmetic average of the reciprocals of the values of
the items of a series. Thus,

$$h = \text{Reciprocal of } \frac{\frac{1}{a} + \frac{1}{b} + \frac{1}{c} + \dots}{n}$$

Example 15. Required to calculate the harmonic mean of 
the data given in table 16. 

Table 18. Computation of Harmonic Mean.

Size of Items      Reciprocals
    Rs.
      4.5             .2222
    250.0             .0040
     12.0             .0833
    119.5             .0084
     30.0             .0333
     42.0             .0238
     75.0             .0133
     35.4             .0283
    Total             .4166


According to the formula we have,

$$h = \frac{8}{.4166}$$

or

$$h = \text{Reciprocal of } \frac{.4166}{8} = \text{Reciprocal of } .05208 = \text{Rs. } 19.2$$

The harmonic mean is always less than the geometric
mean, unless all the items are equal. In the above example:

a = Rs. 71.05
g = Rs. 39.81
h = Rs. 19.2
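
The three averages of the same eight items can be compared in a short sketch. The variable names are assumptions introduced here. Computed without rounding the logarithms, the geometric mean comes out near 39, a little below the 39.81 obtained above from the rounded logarithm 1.6; the order of the three averages is unaffected.

```python
# Illustrative sketch; names are hypothetical, items are those of Table 16.
import math

items = [4.5, 250.0, 12.0, 119.5, 30.0, 42.0, 75.0, 35.4]
n = len(items)

arithmetic = sum(items) / n
geometric = 10 ** (sum(math.log10(x) for x in items) / n)
harmonic = n / sum(1 / x for x in items)   # n over the sum of reciprocals

print(round(arithmetic, 2), round(geometric, 2), round(harmonic, 2))
```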





Characteristics and Uses of Harmonic Mean.

The harmonic mean is determinate and considers the values
of all items of the data. It gives the largest weight to the
smallest items, and is valuable where such weighting is
desirable. It may be used in averaging of rates and time. It is
used in very special cases and is not suitable for general
application. The time and trouble involved in its calculation also
stand in the way of its popularity. It is abstract and not easy
to understand. It may not be an actual example occurring in
the series.
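
The use of the harmonic mean in averaging rates may be illustrated by a sketch with hypothetical figures, not taken from the text: a journey of two equal stretches covered at 30 and at 60 miles an hour. The average speed for the whole journey is the total distance divided by the total time, and this is what the harmonic mean of the two speeds gives.

```python
# Illustrative sketch with hypothetical figures.

speeds = [30, 60]   # miles per hour over two equal distances (assumed)
leg = 60            # length of each stretch in miles (assumed)

harmonic = len(speeds) / sum(1 / s for s in speeds)
arithmetic = sum(speeds) / len(speeds)

total_time = sum(leg / s for s in speeds)           # hours on the road
true_average = leg * len(speeds) / total_time       # distance over time

print(round(harmonic, 6), arithmetic, round(true_average, 6))
```

The arithmetic average of the two speeds overstates the result, because more time is spent on the slower stretch.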

Averages of the First Order. 

The various averages discussed so far are averages of the
"first" order, that is, they deal with the actual values of a
statistical variable. In contrast to them we shall later (in
Chapter XI) study averages of the "second" order, that is,
those which summarize not the actual values but the differences
between them and some average. Averages of the "first"
order can be used as representatives or substitutes of the data
to which they relate.

Typical and Descriptive Averages.

It is always arithmetically possible to calculate an average
from a given series. But this does not imply that every
average is statistically significant. If the average of a
series is found to lie near a point round which the data exhibit
a tendency to cluster, the average may be presumed to be
sufficiently representative of the series. It is then called a
"Typical Average." If, on the other hand, the distribution
of items is irregular so that the data seem to cluster round
several points or do not cluster at all, the average has only
arithmetical significance and should not be considered as fully
representative of the series. It is then called a "Descriptive
Average." A typical average can be substituted for the series
for purposes of comparison or for other information relating
to the series.



Choice of Average.

An average is a simple comprehensive expression of a
series of divergent individual values. All averages do not
characterize the series in the same way. They yield only that
information which, by their nature, they are able to transmit.
This information differs according to the kind of average used.
Therefore, it is the purpose for which an average is to be
employed that will largely determine the choice of an average. No
one average is good for all purposes. Each average is affected
differently by the distribution, frequencies and the character of
the details. A knowledge of its peculiarities or characteristics
is, therefore, a pre-requisite for scientific use of an average. It is
evident that an average simplifies complexity, but if the
particular merits, demerits, scope and characteristics of the average
are neglected, the simplicity arrived at shall not be worth
having. Caution, foresight and analysis are, therefore,
necessary in the use of averages. If they are ignored, the very
principles on which scientific method rests shall be violated.
This is not desirable.

What are the desirable properties for an average to
possess? First, it should be rigidly defined; second, it should
be based on all the observations of the data; third, it should
be readily comprehensible; fourth, it should be capable of
being computed with reasonable ease and rapidity; fifth, it
should be as little affected by fluctuations of sampling as
possible; and, last, it should be readily amenable to arithmetic
or algebraic treatment.

From a perusal of the advantages and disadvantages of
the various averages outlined in the foregoing pages it will
be evident that the arithmetic average (the common mean)
possesses the above properties more than any other single
average does. It is rigidly defined, is based on all
observations, is readily comprehensible, is less affected by fluctuations
of sampling than, say, the median, and, above all, is suitable
for algebraic treatment. Of course, median is somewhat more
easily computed than the arithmetic average, but median is
often indeterminate and its algebraic treatment is difficult, if
not impossible, in many cases. Mode is hardly useful in
elementary work owing to the difficulty of locating it with
precision. Since the arithmetic average uses all the items of
a given series, it is likely to be less erratic, i.e. less sensitive to
small changes in the values of individual items. The arithmetic
average is, therefore, quite suitable for all general purposes
unless there is special reason to select any other average. For
instance, if items of small values are far more numerous in a
series than items of large values, arithmetic average will not be
a good average to use. Instead, the geometric mean will be used.
And, if it is necessary to give more weight to the smallest
items than to other ones, harmonic mean will be the proper
average to compute. Similarly, if enquiry is made into the
'average' size of shoe sold at a shop or an 'average' coat
tailored at a tailor's, it is not arithmetic average but mode
that will serve the purpose. Again, if an idea of average
intelligence of a class is to be had, median shall be the best
average since it can be used even in those cases where the data
are not quantitatively measurable. These are typical cases
where arithmetic average should not, for special reasons, be
used. In general, arithmetic average is suitable for most
purposes.

Limitations of Averages.

It is evident that an average is a summary of the details
of a series. It is used as a substitute for what it replaces.
But here lies its limitation. Different details may yield the
same average, yet it is the details which may be of interest.
An average rarely, if at all, contains as much significance
as the individual items do. If averages are used alone,
unsupported by the details, the details, since they are merged
in the single simple expression, are ignored except in so far as
they are reflected in the summary. Averages, therefore, do
not relate the whole story. They indicate only the central
position of a group. What lies behind them is not their task
to reveal.

It follows that in computing and using an average one
should know the following things if one is to guard oneself
against a fallacy of argument:

(1) The purpose of the average.

(2) The peculiarities of the data to be summarized.

(3) The characteristics of each average.

(4) A deep knowledge of the whole subject to which
    the given data relate in order to be certain that
    the average computed shall be significant and
    suitable.

(5) The extent to which data are homogeneous.

Standardized Death Rate.

If death rate is calculated for each age-group of a locality's
population, and then death rate is calculated for the whole
of the population by the use of weighted average, the latter
death rate is called General Death Rate or Crude Death Rate.
If this crude death rate for a locality is compared with that
for the standard population (e.g. the population of the country
at large; or, of another locality assumed to be standard),
misleading conclusions might result. To avoid fallacious
comparison it is advisable to eliminate differences between age
compositions of the populations of the two localities by
applying the local death rates in each age group to the standard
population. The following is a simple illustration:

Example 16. Required to compute crude and corrected 
death rates. 



Table 19. Computation of General and Standardized Death-rates.

                  Standard Population A               Local Population B
Age-group     Population   Deaths   Death-rate    Population   Deaths   Death-rate
(Years)                             per 1000                            per 1000
Under 5           600        18        30             400        16        40
5-15             1000         5         5            1500         6         4
15-65            3000        24         8            2400        24        10
Above 65          400        20        50             700        21        30
Total            5000        67        13.4          5000        67        13.4


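General Death Rate of Standard Population =

$$\frac{1}{5000}(600 \times 30 + 1000 \times 5 + 3000 \times 8 + 400 \times 50) = 13.4 \text{ per 1,000.}$$

General Death Rate of Local Population =

$$\frac{1}{5000}(400 \times 40 + 1500 \times 4 + 2400 \times 10 + 700 \times 30) = 13.4 \text{ per 1,000.}$$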
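
The two general death rates can be computed as weighted averages of the age-group rates, the weights being the populations of the age-groups. The sketch below is illustrative only; the function name `general_death_rate` and the variable names are assumptions, while the figures are those of Table 19.

```python
# Illustrative sketch; names are hypothetical, figures are from Table 19.

def general_death_rate(populations, rates_per_1000):
    """Weighted average of age-group rates, weighted by population."""
    weighted_sum = sum(p * r for p, r in zip(populations, rates_per_1000))
    return weighted_sum / sum(populations)

# Age-groups: under 5, 5-15, 15-65, above 65.
standard_population = [600, 1000, 3000, 400]
standard_rates = [30, 5, 8, 50]

local_population = [400, 1500, 2400, 700]
local_rates = [40, 4, 10, 30]

crude_standard = general_death_rate(standard_population, standard_rates)
crude_local = general_death_rate(local_population, local_rates)

print(crude_standard, crude_local)
```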

Upon comparison of the two general death rates,
computed by using the weighted average, nothing remarkable will
be noted: both the populations have the same death rate.
And, if death rate is any measure of the health of a population,
both the populations are equally healthy. To justify this
viewpoint one might add that the total number of inhabitants
in both the places, 5000, is the same, that the total number of
deaths in the two cases, 67, is identical, and that the number
of deaths in the age-group (15-65) years is equal in both of
them. With these arguments one could try to make others
believe that both A and B are equally healthy. But it should be
noted that the death rates in different age-groups in both the
places are different and also that the distribution of population
in the two places into various age-groups is not identical.
That is, the basis of comparison is not the same. It is,
therefore, not fully correct to believe that both the towns are
equally healthy unless this conclusion is found to hold good
when the basis of comparison is made identical. To do so, we
eliminate the differences between age constitutions by assuming
that the distribution of local population into different
age-groups is the same as that of the standard population. Then,
by applying the local death rates to the changed distribution
we calculate another weighted average death rate, now called
the Corrected or Standardized Death Rate. Thus,

Standardized Death Rate of Local Population =

$$\frac{1}{5000}(600 \times 40 + 1000 \times 4 + 3000 \times 10 + 400 \times 30) = 14 \text{ per thousand.}$$

The standardized death rate of local population is higher
than the crude death rate of standard population, leading us to
conclude that the local population is less healthy than the
standard population.
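
Standardization amounts to changing the weights while keeping the local rates. A minimal sketch follows; the variable names are assumptions introduced here, and the figures are those of Table 19.

```python
# Illustrative sketch; names are hypothetical, figures are from Table 19.

# Age-groups: under 5, 5-15, 15-65, above 65.
standard_population = [600, 1000, 3000, 400]   # weights borrowed from A
local_rates = [40, 4, 10, 30]                  # local death rates per 1,000

# Local rates applied to the age distribution of the standard population.
standardized = (sum(p * r for p, r in zip(standard_population, local_rates))
                / sum(standard_population))

crude_standard = 13.4   # general death rate of the standard population

print(standardized, standardized > crude_standard)
```

Since both rates now rest on the same age distribution, the remaining difference reflects the age-group death rates alone.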

Similarly, there could be a case where the general death
rates of A and B would have been different, but the death
rates for the different age-groups the same. This paradox,
again, would have been due to differences in the distribution
of population into the various age-groups. The paradox is
removed by computing the standardized or corrected death rate
of the local population.

This method is of general application. We have applied 
it to death rates. We may standardize marriage rates or 
unemployment rates as well. 





EXERCISES 

(1) What is an average? How does it differ from a
percentage? What purposes does it serve?

(2) What do you understand by homogeneity of data? Should
the data from which averages are computed be homogeneous? Give
reasons.

(3) Define Mode, Median, Mean, Geometric average and
Harmonic mean, and clearly explain their uses.

In which problems can each one of them be used with the
greatest advantage?

(4) Compare the advantages and disadvantages of the different
averages.

(5) What are the properties that are desirable in an average?
Which average possesses a majority of these properties?

(6) How will you locate the mode when the distribution of
frequencies for class-intervals whose magnitude is one inch gives
three maxima? Take a hypothetical example to explain the
whole process.

(7) How will you locate the median when

(a) the number of items in a series is even,

(b) the series is a discrete one,

(c) only the frequency distribution of a series is given?

(8) What are Quartiles, Deciles and Percentiles? What
information do they give regarding a series? How are they
calculated in (i) series of individual observations, (ii) discrete series and
(iii) continuous series? Show their relationship with the median.

(9) How will you compute the simple arithmetic average of

(i) a series of individual observations,
(ii) a discrete series, and
(iii) a continuous series?

Explain the direct as well as the short-cut methods.

(10) Define weighted average, and explain how it differs
from simple mean. Give the method of its computation and point
out the cases in which weighted average should be used.





(11) Differentiate between Crude (General) and Corrected 
(Standard) death rates. 

How is the principle of weighting applied to the determina- 
tion of standardized death rates from crude death rates? 

(12) Discuss critically the use of weighted mean in statistics. 

(B. Com., Cal., 1937). 

(13) Explain the significance of ‘ weights.* Is it the absolute 
size of the weights that matters? 

(14) State the formulae of the principal forms of averages 
employed in Statistics, and explain, so far as you can, the principles 
upon which they are based. 

(16) What are the limitations of the uses of each one of the 
different kinds of average? 

(B. Com., Alld., 1939). 

(16) Which average would you use in studying the following 
problems and why? 


(a) Comparing the economic condition of India with U. K’s. 

(b) Size (number of members) of an average family. 

(c) Size of agricultural holding. 

(d) Average marks in an examination. 

(e) Average height or weight of students. 

(f) Average length of the leaves of a tree. 

(g) Average intelligence. 

(h) Average sales of a shopkeeper. 

(17) Calculate the mode, median, arithmetic average and 
quartiles of the series relating to heights of 63 students given in 
exercise 17, chapter VIII. 


(18) Calculate the average earnings of labourers from the 
series given in exercise 20, chapter VIII. 


(19) Calculate the geometric, harmonic and arithmetic means 
of the series given in exercise 1, chapter VI. 





(20) According to the census of 1941, the following are the 
population figures, in thousands, of the first 36 cities of India: 

    2488    591    437    208    213    143 
    1490    407    284    176    169    181 
     777    387    302    213    204    153 
     733    391    263    176    178    142 
     522    360    260    193    131     92 
     672    258    239    160    147    151 


Find the median, arithmetic average and quartiles. 

(22) Compute the mode, median and arithmetic average of 
the following series. Account for their difference. 

    Size of item    Frequency 

         2              3 
         3              8 
         4             10 
         5             12 
         6             16 
         7             14 
         8             10 
         9              8 
        10             17 
        11              5 
        12              4 
        13              1 


(23) The following table gives the marks obtained by a 
batch of 25 students in a certain class-test in Economics and 
Politics: 

    Roll Number of 
    the Students      Economics    Politics 

          1              29           36 
          2              65           30 
          3              33           38 
          4              45           39 
          5              51           64 
          6              72           50 
          7              48           46 
          8              33           15 
          9              42           42 
         10              25           10 
         11              28           72 
         12              35           33 
         13              46           80 
         14              47           44 
         15              60           85 
         16              30           20 
         17              32           32 
         18              52           25 
         19              54           55 
         20              56           28 
         21              58           53 
         22              49           35 
         23              38           40 
         24              40           62 
         25              46           58 

In which subject is the level of knowledge of the students, 
as revealed from the above figures, higher? Give reasons. 



(M.A., Alld., 1937). 

(24) Find out the Mode of the following series: 

    Size    Frequency        Size    Frequency 

      5        48             13        52 
      6        52             14        41 
      7        56             15        57 
      8        60             16        63 
      9        63             17        52 
     10        57             18        48 
     11        55             19        40 
     12        50 

(B. Com., Alld., 1943). 

Also calculate the median and quartiles of the above series. 

(25) Compute the weighted geometric average of Relative Prices 
of the following commodities for the year 1939 (Base year 1938; 
Price 100): 

                                        Weight 
    Commodity      Relative Price       (value produced 
                                        in 1938) 

    Corn               128.8               1,385 
    Cotton              62.4                 819 
    Hay                117.7                 842 
    Wheat               99.0                 561 
    Oats               130.9                 408 
    Potatoes           143.5                 194 
    Sugar              125.6                 142 
    Barley             150.2                 100 
    Tobacco            101.1                 103 
    Rye                116.2                  25 
    Rice               117.5                  17 
    Oil Seed            78.7                  29 

How does it differ from the unweighted geometric mean, and 
why? 

(B. Com., Alld., 194f8). 

(26) ‘Statistics help collective agreements of wage 
adjustments.’ What data are required for the consideration of 
a revision in wage rates in a factory? Which average will you 
utilize, and why? 

(M. Com., Alld., 1943). 

(27) Compare the relative advantages and disadvantages of 
the Arithmetic mean, the Median, and the Mode. 


The following table gives the results of certain examinations 
of three universities in the year 1936. Which is the best university? 
Give reasons for your answer. 

                      Percentage result in the University 
    Examination          A          B          C 

    M. A.               80         75         70 
    M. Sc.              70         70         60 
    B. A.               65         80         70 
    B. Sc.              60         70         80 
    B. Com.             75         65         75 


(M.A., Cal., 1937). 






(28) The following table gives the number of persons with 
different incomes in the U.S.A. during the year 1929. 

    Income in thousands      No. of persons 
    of dollars               in Lakhs 

    Under 1                       13 
    1-2                           90 
    2-3                           81 
    3-5                          117 
    5-10                          66 
    10-25                         27 
    25-60                          6 
    60-100                         2 
    100-1000                       2 

Calculate the average income per head. 

(B. Com., Luck., 1939). 

(29) The marks obtained by students of classes A and B 
are given below. Give as much information as you can regarding 
the composition of the classes in respect of intelligence: 

    Marks obtained    No. of students    No. of students 
                      in class A.        in class B. 

    5-10                    1                  5 
    10-15                  10                  6 
    15-20                  20                 16 
    20-25                   8                 10 
    25-30                   6                  6 
    30-35                   3                  4 
    35-40                   1                  2 
    40-45                   0                  2 


(B. Com., Agra, 1939). 


(30) Explain what is meant by weighted average, and 
discuss the effect of weighting. 

Calculate (i) the unweighted mean of the prices in column III 
and (ii) the mean obtained by weighting each price by the quantity 
consumed, and explain why they differ as they do: 

    I                     II                     III 
    Articles of Food      Quantity Consumed      Price in Rupees 
                                                 per maund 

    Flour                 11.5 mds.                 5.8 
    Ghee                  5.6 mds.                 58.4 
    Sugar                 .28 mds.                  8.2 
    Potato                .16 mds.                  2.5 
    Oil                   .35 mds.                 20.0 

(M.A., Cal., 1937). 


(31) Explain the short-cut method of calculating the 
arithmetic average. 

The following data relate to sizes of shoes sold at a store 
during a given week. Find the average size by the short-cut 
method : — 


Size of Shoes 

4-5 

5 

5*5 ! 6 

1 

6-5 7 

7*5 

8 

8-s| 9 

9-S 

10 

10 5 11 

No. of pairs 

1 

Li 

4j 5 

iTsjSO 

60 

95 

82 j 75 

j44 

25 

15 4 

1 


^ Cal. 1936). 

(32) The following table gives the population of males at 
different age-groups of the U. K. and India at the time of the 
Census of 1931. 

    Age-Group       U. K.      India 
                    Lakhs      Lakhs 

    0-5               18        214 
    5-10              19        258 
    10-15             20        222 
    15-20             18        157 
    20-25             16        145 
    25-30             14        161 
    30-40             27        257 
    40-50             25        184 
    50-60             19        120 
    Above 60          17        100 

Compare the average age of males in the two countries, and 
account for the difference, if any. 

(B. Com., Luck., 1941). 
(B. Com., Alld., 1936). 

(33) From the following table calculate the average price of 
a lb. of biscuits and also the weighted average price. 

    Price per lb.       lbs. sold 
    Rs. A. P. 

    0-10-0                100 
    0-16-0                 87 
    1-2-0                  63 
    1-4-0                  59 
    1-8-0                  49 
    2-0-0                  19 

Which of the two averages gives a better indication of the 
average price? 

(34) Following are the lengths in inches of 101 nim leaves. 
Tabulate them in class-intervals of .5 inches and calculate the 
mode, median, arithmetic average, geometric and harmonic means. 
Which of them represents the series best? 

1.85, 1.5, 1.95, 1.9, 1.6, 2.2, 2.45, 2.72, 2.48, 3.0, 3.7, 3.0, 

2.85, 3.25, 2.48, 3.43, 2.80, 2.35, 2.64, 2.76, 2.9, 2.6, 2.65, 2.95, 

2.70, 2.50, 1.95, 1.95, 1.58, 2.45, 2.92, 2.95, 2.78, 2.60, 2.54, 

2.79, 2.90, 3.05, 2.82, 2.38, 2.90, 2.88, 2.15, 1.75, 2.40, 2.48, 2.15, 

2.65, 2.50, 2.20, 2.40, 2.45, 2.5, 2.56, 2.40, 2.25, 2.30, 1.50, 1.90, 

2.80, 2.88, 2.30, 1.95, 1.85, 2.95, 2.90, 2.00, 2.80, 3.26, 2.95, 3.20, 

2.86, 2.70, 2.77, 2.44, 2.10, 2.54, 2.70, 2.40, 2.65, 2.60, 2.94, 

2.06, 2.06, 2.50, 2.30, 1.90, 2.78, 2.60, 2.35, 2.72, 2.86, 2.70, 2.60, 
2.63, 1.98, 2.94, 8.05, 2.66, 2.85, 3.10. 

Also calculate the quartiles, deciles and 60th and 37th 
percentiles. 

(35) Calculate the arithmetic average of the above series 
by (i) the direct method, and (ii) the short-cut method, first, 
of the individual observations and, next, of their frequency 
distribution. Compare the results. 

(36) Calculate the simple arithmetic average of the following 
series by the direct and the short-cut methods: 

    Size of item     Frequency 

    3-5                 14 
    5-7                 16 
    7-9                 25 
    9-11                22 
    11-13               12 

Also calculate the median and the mode and compare the 
results. 

(37) Compute the arithmetic, geometric and harmonic means 
of the following items. 

376.5, 16.3, 28.6, 12.01, 4.6, 3.7, 12.79, 36, 41.9, 63. 

(38) If in ten successive years the quantities 1, 2, 3, 4, 5, 6, 
7, 8, 9, 10 are sold at prices 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, what are 
the weighted average and simple arithmetic average prices? 

(39) Find the median, mode and arithmetic average of the 
daily wages in the following series, and state which of these 
represents the series best. 

    Daily Wages in       Frequency 
    Annas                (Number of employees) 

         3                     2 
         5                    10 
         7                    12 
         9                    15 
        11                    20 
        13                    13 
        15                    12 
        17                    10 
        19                     4 


(40) Show the relative positions of different averages in 
moderately symmetrical series. 

Find the mode, median and arithmetic average of the following 
series and state if the series is symmetrical. 

    Size of item     Frequency 

        10              30 
        11              36 
        12              38 
        13              42 
        14              46 
        15              42 
        16              38 
        17              36 
        18              30 





(41) Which of the two places for which the mortality data 
are given below would you describe as more healthy than the other? 
Give reasons. 

    Age in              Town X                   Town Y 
    Years        Population    Deaths     Population    Deaths 

    under 10       16,000        375        10,000        300 
    10-30          60,000        250        62,000        312 
    30-70         120,000        840       126,000      1,008 
    above 70       15,000        976        12,000        840 

(42) Compute the Crude and Standardized death-rates in 
the following, and state if local population has higher or lower 
death-rate. 

    Age group      Standard Population        Local Population 
    (years)       Population    Deaths     Population    Deaths 

    under 5          6,000        150         2,500         63 
    5-15            10,000         20        12,500         25 
    15-65           12,600         60        20,000         80 
    Above 65         4,000        160         6,000        200 

(43) Value of Exports & Imports of Commodities for India 
for 1934-35. 

    Months          Exports           Imports 
                    Crores of Rs.     Crores of Rs. 

    April              12.7              10.9 
    May                13.3              10.6 
    June               12.6               9.5 
    July               12.8               9.9 
    August             12.8              10.7 
    September          12.1              10.5 
    October            12.4              12.6 
    November           12.3              11.4 
    December           12.2              10.3 
    January            13.7              12.9 
    February           13.2              10.6 
    March              16.6              12.4 

    Total             156.1             132.2 

Calculate the median, arithmetic average and the geometric 
mean of the above figures of exports & imports separately. Which 
of the averages represents the series best? 

(44) The following are the weekly market values of the shares 
of ‘Imperial Bank of India’ (paid up value Rs. 500) from 
Jan. 4, 1933 to Dec. 20, 1933. 

1335, 1235, 1236.5, 1261.5, 1266.6, 1166, 1190, 1176, 1160, 
1186, 1221.5, 1220, 1234, 1230, 1235, 1236.6, 1232.5, 1251.3, 1244, 

1231.6, 1216.5, 1216.6, 1221.5, 1221.5, 1205, 1184, 1196, 1212.6, 
1207, 1192, 1196, 1206, 1201, 1196, 1198, 1208.6, 1202.5, 1234.6, 

1234.6, 1226, 1230, 1220, 1220.5, 1220.5, 1240, 1252.5, 1246, 
1246.5, 1246.5, 1230.6. 

(from the ‘ Capital ’). 

Calculate the mode, median, mean, geometric and harmonic 
averages of the above series. 

(45) The following table gives the age distribution of widows 
in India (Census Report 1931). Calculate the median age of the 
widows and also the upper and lower quartiles. 

    Years            No. of widows 

    0-10                 136,862 
    10-20                718,101 
    20-30              2,466,835 
    30-40              4,847,631 
    40-50              6,480,269 
    50-60              6,908,169 
    60-70              3,748,616 
    70 & over          1,967,606 

    Total             26,247,968 


(46) The following table gives the marks obtained by a batch 
of 30 B. Com. students in a class test in Statistics. (Marks 100.) 

    Roll No.   Marks obtained      Roll No.   Marks obtained 

       1            33                16           %4 
       2            32                17           83 
       3            55                18           42 
       4            47                19           38 
       5            21                20           45 
       6            50                21           26 
       7            27                22           33 
       8            12                23           44 
       9            68                24           48 
      10            49                25           52 
      11            40                26           30 
      12            17                27           58 
      13            44                28           37 
      14            48                29           38 
      15            62                30           35 

Find the values of the Mode, the Median, and the Quartiles. 

(B. Com., Alld., 1938). 

(47) Calculate the arithmetic average and the geometric mean 
of the series given in exercise 46 above. Which average represents 
the series better? 

(B. Com., Alld., 1938). 

(48) The following table gives the distribution of population 
according to age in India and Japan at the time of the last 
census (1931): 

    Age group          Population in millions in 
    in years             India         Japan 

    0-10                  98.9          17.8 
    10-20                 72.5          14.3 
    20-30                 63.2          11.3 
    30-40                 48.6           8.6 
    40-50                 32.6           6.5 
    50-60                 19.4           5.4 
    60-80                 13.2           5.1 

Calculate the average age of people in India and in Japan, and 
comment on the difference. 

(B. Com., Alld., 1940). 
(49) In order to decide whether one city is more healthy than 
another on the basis of death-rates, what information would you 
require in addition to the total number of deaths and the total 
population of the two cities? How would you use this information 
to decide which city is more healthy? 

(M.A., Alld., 1935). 

(50) Calculate the simple average and the weighted average 
of the following items: — 

















Account for the difference in the two averages: 

(M.A., Alld., 1940). 

(51) The following is the distribution of wages per thousand 
employees in a certain factory: 


Daily wages 
in Annas 

2- 

4. 

6- 

8- 10- 

12-14- 

16- 

18- 20- 22- 24- Total 

Number of 
employees 

o 

13 

43 

' 1 i 

102,175 1201204 

139 

L_ 

69 25 6 i 1 1000 


Calculate the modal and median wages, and explain why there 
is a difference between the two. 


(M.A.. Alld., 1940). 

(52) Calculate the mode, median and arithmetic average of the 
following series and state which of them represents the series best. 

    Size of item     Frequency 

    6-10                20 
    11-15               30 
    16-20               50 
    21-25               40 
    26-30               10 

Also calculate the quartiles, deciles and 20th and 80th 
percentiles, and point out what light they throw on the series. 

(53) The following table gives the frequency distribution of 
the weights of students in a class. Find the median and the mode. 
Which of them represents the series better? 

    Weight in lbs.     Frequency 

    Below 80              10 
    80-90                 10 
    90-100                25 
    100-110               50 
    110-120               48 
    120-130               32 
    Over 130              20 

    Total                201 


(54) Point out the ambiguity or mistake, if any, in the 
following statements: 

(i) There are 260 employees in a sugar mill. Their daily 
earnings are about Re. 1/- per man on an average. 
Therefore, their total monthly earnings are Rs. 7,500. 

(ii) An ordinary person in India consumes one chhatak of 
pulse per day. Therefore, the total quantity of pulse 
consumed by India’s 40 crores of people every year 
is about 23 crores of maunds. 

(iii) The monthly expenditure of the vast majority of 
students in a university is Rs. 50. Therefore, the 
total monthly expenditure of 2,000 students is 
Rs. 1,00,000. 

(iv) A cloth dealer usually receives 150 customers a day. 
Therefore, the total number of customers he receives 
in a month is 4,500. 

(55) The following table gives the value of imports of 
commodities into India in crores of Rs. 

    Months          1934-35     1935-36     1936-37 

    April             10.9        11.6        10.1 
    May               10.5        11.8        10.0 
    June               9.6         9.9         9.8 
    July               9.9        10.1        10.1 
    August            10.7        11.2         9.3 
    September         10.5        10.2         9.6 
    October           12.6        11.8        10.7 
    November          11.4        12.7        10.6 
    December          10.3        10.6         9.9 
    January           12.9        13.1        12.6 
    February          10.6        10.5         9.3 
    March             12.4        10.8        13.1 

Calculate: 

(a) The average import into India for each month for the 
whole period. 

(b) The geometric mean and harmonic mean for 1934-35. 

(c) The median and mode for 1936-37. 

(56) Calculate the average percentage increase per decade 
in the population of India from 1881 to 1941 from the figures 
given in exercise 11, chapter IX. 

(57) A railway train runs for 60 minutes at a speed of 40 
miles an hour and then, because of repairs of the track, runs for 
10 minutes at a speed of 10 miles an hour. What is its average 
speed? 

(58) Differentiate between 

(a) Typical & Descriptive averages 

(b) Averages of the first order & of the second order. 

(59) Amend the following table, and locate the median from 
the amended table. Also measure the magnitude of the median so 
located. 

    Sizes               Frequency 

    10-16                  10 
    16-17.5                15 
    17.5-20                17 
    22-30                  25 
    30-35                  28 
    35-40                  30 
    45 & upwards           40 

(B. Com., Alld., 1942). 

(60) Monthly incomes of twenty families are given below in 
rupees: 

2,000; 36; 400; 16; 40; 1,500; 300; 6; 90; 260; 20; 12; 
450; 10; 150; 8; 25; 30; 1,200; 60. 

Calculate the Geometric Mean and the Harmonic Mean of the 
above incomes. 

(B. Com., Alld., 1941). 



CHAPTER XI 

DISPERSION AND SKEWNESS 

DISPERSION 

Meaning of Dispersion. 

Averages of the “first order”, discussed in the last 
chapter, consider only the central position of a series. They 
do not throw light on the formation of the series. They fail 
to characterize the detail from which they are made up. 
Hardly ever are the various items of a series equal to the value 
of the average computed from them. Some measure of the 
differences of the items from their average is necessary. 
Averages of the “second order” provide this measure. By 
their use, an average or a type, not of all the items of the 
series, but of their differences from an average is obtained. 
In averaging these differences their irregularities are brushed 
off, and a type, a representative figure, results. 

All frequency distributions are not similar. They may 
differ in the numerical size of their averages, or they may have 
the same values of their averages yet differ in their respective 
formations. Let us suppose that the daily earnings of A and B, 
two mechanics, during the six days of a working week are 
as given below: 

    Days            A’s earnings     B’s earnings 
                        Rs.              Rs. 

    Monday               4                3 
    Tuesday              4                4 
    Wednesday            5                5 
    Thursday             5                5 
    Friday               6                6 
    Saturday             6                7 

    6 days            Rs. 30           Rs. 30 


The total earnings made in 6 days, Rs. 30, and therefore 
also the average earnings, Rs. 5 per day, are thus exactly the 
same in both cases, but the scatteredness of the values of the 
items of the series round their average is different in the two 
cases. The greatest deviation from the average in A’s case is 
Re. 1 and in B’s case Rs. 2. The two series are, therefore, 
differently constituted, though they result in averages of the 
same numerical value. It follows that averages must be used 
with great caution. To these cautions belongs the measurement 
of the dispersion or scatteredness of the series around the 
mean. The value of the mean depends necessarily on the 
distribution of the items around it and on its position in the 
series. An examination of this distribution furnishes us with 
a valuable supplement to the information given by the mean 
itself: it tells us how the items comprised in a group vary in 
size. This helps us in finding the extent to which the average 
is ‘typical’. 
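The comparison of the two mechanics can be checked with a few lines of Python. This is only an illustrative sketch: the function names are chosen here, and B's earnings for Thursday, Friday and Saturday are read as 5, 6 and 7, which is an assumption that agrees with the stated total of Rs. 30 and greatest deviation of Rs. 2.

```python
def mean(values):
    """Arithmetic average of a list of numbers."""
    return sum(values) / len(values)


def greatest_deviation(values):
    """Largest absolute difference between an item and the average."""
    centre = mean(values)
    return max(abs(v - centre) for v in values)


a_earnings = [4, 4, 5, 5, 6, 6]
b_earnings = [3, 4, 5, 5, 6, 7]  # last three values are an assumed reading

for name, series in (("A", a_earnings), ("B", b_earnings)):
    # Same total and average, but a different scatter round the average.
    print(name, sum(series), mean(series), greatest_deviation(series))
```

Both series give the same total and the same average, and only the last figure printed distinguishes them.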

The term Dispersion is used in two senses in Statistics. 

One sense is general, implying that within a given group the 
items are not uniform in their size. That is, they differ in their 
magnitude. This difference may be great or small. Accordingly, 
dispersion may be considerable or slight. If the profits 
of a given number of businesses in the same trade and with the 
same capital are found to vary between Rs. 11,000 and 
Rs. 11,020, they are scattered over a small range. That is, they 
are fairly consistent, or in other words, their variability is 
slight. If, however, the profits vary between Rs. 1,100 and 
Rs. 11,000, consistency is wanting, the range is wide, or in 
other words, variability is considerable. The other sense in 
which the term dispersion is used is more precise. In this 
sense it indicates an absolute or relative measure of the 
differences of the items of a group from some average computed 
from those items. It may be noted that the difference between 
the measurement of the value of a variable and its mean, or 
any other fixed point, is technically termed deviation. And a 
measure of the deviations of the size of items from an average 
is called the Measure of Dispersion. 

Measures of Dispersion. 

The two senses in which the term dispersion is used are 
important. The first sense points to the limits within which 
data fall; the second sense calls attention to the amount, 
absolute or relative, by which the values of the items differ 
from an average or type. The two senses are different. 
Dispersion, in the first sense, is indicated by the method of limits, 
where the complete range of the items may be shown. 
Dispersion, in the second sense, is expressed by the method of 
averaging differences from a type. 

Method of Limits. 

The most common way of measuring dispersion under this 
method is that of computing the Range. 

The Range. 

Range of dispersion represents the difference between 
the values of the extremes, i.e. the largest and the 
smallest items, of the data under review. If in a 
certain class the height of the shortest student was 4 ft. 
10 in., and that of the tallest 6 ft., the range would evidently 
be 14 inches. Range is thus the simplest method of measuring 
dispersion; but it is too indefinite to be used as a practical 
measure of dispersion, since it depends entirely upon the values 
of the extreme items. For instance, if a dwarf whose height 
was only 3 ft. 6 inches was admitted to the class, in the above 
example, the range would suddenly rise to 30 inches, while the 
average height of the class would not be materially affected. 
There is yet another reason why the range is not a satisfactory 
measure of dispersion. It is that the range does not take into 
account the distribution of the items within its limits. This 
distribution may vary widely, even though the extremes be of 
the same value. We might get the same value for the range 
from a symmetrical and a J-shaped (i.e. asymmetrical) 
frequency-curve. Clearly, two such distributions cannot be 
regarded as exhibiting the same dispersion. On the other hand, 
series with different maxima and minima may have practically 
the same distribution or dispersion. 

Besides, the range is an absolute measure of dispersion, 
and therefore it does not make it possible to compare the 
relative dispersion of two series expressed in different units. 
Absolute dispersion measured, say, in yards, is not comparable 
with dispersion measured in tons. If the absolute dispersions 
are reduced to relative bases, comparison would be possible. 
This is done by dividing the range by the sum of the extreme 
items. The quotient is called the co-efficient or ratio of 
dispersion. 

In our example of the height of students taken above, the 
extreme heights are 58 inches and 72 inches, so that the 
range is 14 inches, the sum of the extremes is 130 inches, and 
the co-efficient is $\frac{14}{130} = .108$. 
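The range and its co-efficient for the class of students can be reproduced as follows. This is a minimal sketch; the function names are illustrative, and the heights are expressed in inches (4 ft. 10 in. = 58, 6 ft. = 72, 3 ft. 6 in. = 42).

```python
def value_range(values):
    """Difference between the largest and the smallest item."""
    return max(values) - min(values)


def range_coefficient(values):
    """Range divided by the sum of the two extreme items."""
    return (max(values) - min(values)) / (max(values) + min(values))


heights = [58, 72]            # shortest and tallest student, in inches
with_dwarf = heights + [42]   # the same class after the dwarf is admitted

print(value_range(heights))                  # 14
print(round(range_coefficient(heights), 3))  # 0.108
print(value_range(with_dwarf))               # 30
```

Only the two extreme items enter either function, which is the weakness of the range that the text points out.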

Method of Averaging Deviations. 

Under this method the most common measures are (1) the 
Average or Mean Deviation, (2) the Standard Deviation, and 
(3) the Quartile Deviation. These are absolute measures. 
They are converted into relative measures for purposes of 
rendering comparison between series measured in different 
units possible. 

(1) First Moment of Dispersion or Average Deviation and 
its Co-efficient. 

The first moment of dispersion, also called the mean 
or average deviation, is the arithmetic average of the 
deviations of the group measured from an average (Median, Mode 
or Mean) taking all deviations as positive. In other words, 
it is the sum of the deviations from an average divided by the 
number of items. It is necessary to treat all deviations as 
positive, since the sum of the deviations from the arithmetic 
average taken with minus and plus signs is zero, and that from 
the median and the mode is nearly zero. 



Let $d$ stand for deviation, i.e., difference between an 
individual item and the average, and $x$ for the value of an 
individual item. Then, 

$d = x - a$ = deviation from the arithmetic average, 
$d_m = x - M$ = deviation from the median, 
$d_z = x - Z$ = deviation from the mode. 

If $n$ be the number of items in a series, 

$\frac{\Sigma d}{n}$ = The first moment of Dispersion from the Arithmetic 
Average, represented by $\delta$, 

$\frac{\Sigma d_m}{n}$ = The first moment of Dispersion from the Median, 
represented by $\delta_m$, 

$\frac{\Sigma d_z}{n}$ = The first moment of Dispersion from the Mode, 
represented by $\delta_z$. 

It is proper to calculate the mean deviation from the 
median because the sum of deviations, and consequently the 
mean deviation, is least when median is chosen as the origin 
from which deviations are measured. In practice, however, it 
is easier and not unsatisfactory to calculate it from the 
arithmetic average. In case the observations are recorded in 
groups between different limits, mean deviation from the 
median is difficult to calculate with precision, and therefore, 
arithmetic mean rather than the median may be chosen as the 
origin. It is not a common practice to calculate the mean 
deviation from the mode. 

Mean deviation gives us the absolute measure of dispersion. 
This is one factor that is required for calculating the 
relative measure of dispersion, called Mean Co-efficient of 
Dispersion. The other factor required is the mean used in 
the particular case. Thus, 

$\frac{\delta}{a}$ = Co-efficient of Dispersion from the Arithmetic Average, 

$\frac{\delta_m}{M}$ = Co-efficient of Dispersion from the Median, and 

$\frac{\delta_z}{Z}$ = Co-efficient of Dispersion from the Mode. 


Calculation of Mean Deviation and its Co-efficient. 

Example 1. Required to find mean deviation and its 
co-efficient when individual quantitative observations are given. 

Table 20. Calculation of Mean Deviation of X's monthly 
earnings for a year. 

    Months     Monthly Earnings     Deviations from median (Rs. 42) 
                                    (+ & - signs ignored) 
                     m                        d_m 
                    Rs.                       Rs. 

       1             39                        3 
       2             40                        2 
       3             40                        2 
       4             41                        1 
       5             41                        1 
       6             42                        0 
       7             42                        0 
       8             43                        1 
       9             43                        1 
      10             44                        2 
      11             44                        2 
      12             45                        3 

    n = 12                           $\Sigma d_m$ = Rs. 18 

Median or $M$ = Value of the $\left(\frac{n+1}{2}\right)$th item 
= Value of 6.5th item = Rs. 42. 

Mean Deviation from the Median or $\delta_m = \frac{\Sigma d_m}{n} = \frac{18}{12}$ = Rs. 1.5. 

Co-efficient of Dispersion from the Median $= \frac{\delta_m}{M} = \frac{1.5}{42}$ 
= .0357 approximately. 

Since the values of arithmetic mean and median are the 
same in this example, mean deviation from the arithmetic mean 
and its co-efficient of dispersion will also have the same values 
as those from the median. We can now state that the mean or 
median earnings of X are Rs. 42 and the earnings deviate from 
the mean or median on an average by Rs. 1.5. 
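The arithmetic of Example 1 may be verified with a short Python sketch. The function name `mean_deviation` is chosen here for illustration; `statistics.median` from the standard library averages the two middle items when the number of items is even, which matches the value of the 6.5th item used above.

```python
from statistics import median


def mean_deviation(values, origin):
    """Average of the deviations from `origin`, all taken as positive."""
    return sum(abs(v - origin) for v in values) / len(values)


# X's monthly earnings in rupees, as in Table 20.
earnings = [39, 40, 40, 41, 41, 42, 42, 43, 43, 44, 44, 45]

m = median(earnings)                   # value of the 6.5th item
delta_m = mean_deviation(earnings, m)  # first moment about the median
coefficient = delta_m / m              # mean co-efficient of dispersion

print(m, delta_m, round(coefficient, 4))
```

Passing the arithmetic average as `origin` instead of the median gives the same result for this series, since the two averages coincide.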

Example 2. Required to find mean deviation and its 
co-efficient when a discrete series is given. 

Table 21. Calculation of Mean Deviation. 

    Size of     Frequency     Deviation from          Total Deviation 
    item                      Median (5)              (Frequency x 
                              (+ & - signs ignored)   deviation) 
      m             f               d_m                   f d_m 

      2             2                3                      6 
      3             3                2                      6 
      4             5                1                      5 
      5             8                0                      0 
      6             6                1                      6 
      7             4                2                      8 
      8             2                3                      6 
      9             1                4                      4 

                  n = 31                          $\Sigma f d_m$ = 41 

Median or $M$ = the value of the $\left(\frac{n+1}{2}\right)$th item 
= the value of 16th item = 5. 

Mean Deviation from Median or $\delta_m = \frac{\Sigma f d_m}{n} = \frac{41}{31} = 1.32$ 

Mean Co-efficient of Dispersion $= \frac{\delta_m}{M} = \frac{1.32}{5} = .264$ 
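For a discrete series the same steps can be sketched in Python with the frequencies acting as weights. The helper names below are illustrative only, and the median is located by running through the cumulative frequencies until the $\left(\frac{n+1}{2}\right)$th item is reached, which is one simple way of doing it.

```python
def discrete_median(sizes, freqs):
    """Size of the ((n + 1) / 2)th item, found from cumulative frequencies."""
    position = (sum(freqs) + 1) / 2
    running = 0
    for size, f in zip(sizes, freqs):
        running += f
        if running >= position:
            return size


def discrete_mean_deviation(sizes, freqs, origin):
    """Sum of frequency x |size - origin|, divided by the number of items."""
    total = sum(f * abs(s - origin) for s, f in zip(sizes, freqs))
    return total / sum(freqs)


sizes = [2, 3, 4, 5, 6, 7, 8, 9]
freqs = [2, 3, 5, 8, 6, 4, 2, 1]   # n = 31, as in Table 21

m = discrete_median(sizes, freqs)                    # the 16th item
delta_m = discrete_mean_deviation(sizes, freqs, m)   # 41 / 31
print(m, round(delta_m, 2), round(delta_m / m, 3))
```

The co-efficient printed may differ in the third decimal place from the figure in the text, because the text rounds the mean deviation to 1.32 before dividing by the median.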


Example 3. Required to find the mean deviation and its 
co-efficient when data are composed of a continuous series, 
measuring the deviations from the median as well as from the 
arithmetic average. 

Table 22. Calculation of Mean Deviation of marks of 60 
students in Economics. 

    Marks-   Mid-    Fre-     Deviation      Total       Deviation       Total 
    group    value   quency   from Median    Deviation   from            Deviation 
                              (35 marks)     from        Arithmetic      from 
                              (+ & - signs   Median      Average         Arithmetic 
                              ignored)                   (34.7 marks)    Average 
                                                         (+ & - signs 
                                                         ignored) 
               m       f         d_m          f d_m          d             f d 

    0-10       5       4         30            120          29.7          118.8 
    10-20     15       8         20            160          19.7          157.6 
    20-30     25      11         10            110           9.7          106.7 
    30-40     35      15          0              0           0.3            4.5 
    40-50     45      11         10            110          10.3          113.3 
    50-60     55       7         20            140          20.3          142.1 
    60-70     65       4         30            120          30.3          121.2 

                    n = 60          $\Sigma f d_m$ = 760             $\Sigma f d$ = 764.2 

(i) Median or $M$ = 35 marks, by interpolation in (30-40) 
marks group. 

Mean Deviation from Median or $\delta_m = \frac{\Sigma f d_m}{n} = \frac{760}{60}$ marks 
= 12.67 marks, approx. 

Co-efficient of Dispersion from Median $= \frac{\delta_m}{M} = \frac{12.67}{35}$ 
= .36 approx. 

(ii) Arithmetic Average or $a$ = 34.7 marks. 

Mean Deviation from Arithmetic Average or $\delta = \frac{\Sigma f d}{n} = \frac{764.2}{60}$ marks 
= 12.74 marks. 

Co-efficient of Dispersion from Arithmetic Average $= \frac{\delta}{a} = \frac{12.74}{34.7}$ 
= .37. 

The above example amply demonstrates that mean deviation 
when measured from the median is least, $\delta_m$ being less 
than $\delta$ in this case. It would also be less than $\delta_z$, or mean 
deviation from any other point. 
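The claim that the mean deviation is least when measured from the median can be tried out numerically on the figures of Table 22. The sketch below is an illustration only: the names are chosen here, each group is represented by its mid-value as in the table, and the arithmetic average is computed unrounded (34.67 instead of 34.7), so the second result agrees with the text only to about two decimal places.

```python
mid_values = [5, 15, 25, 35, 45, 55, 65]
freqs = [4, 8, 11, 15, 11, 7, 4]        # 60 students in all


def grouped_mean(mids, fs):
    """Arithmetic average of a frequency distribution."""
    return sum(m * f for m, f in zip(mids, fs)) / sum(fs)


def grouped_mean_deviation(mids, fs, origin):
    """Mean deviation from `origin`, signs of deviations ignored."""
    return sum(f * abs(m - origin) for m, f in zip(mids, fs)) / sum(fs)


median = 35                              # taken from the text
a = grouped_mean(mid_values, freqs)      # about 34.67 marks

delta_m = grouped_mean_deviation(mid_values, freqs, median)
delta = grouped_mean_deviation(mid_values, freqs, a)
print(round(delta_m, 2), round(delta, 2))

# Trying every whole-number origin from 0 to 70 marks shows that none
# gives a smaller mean deviation than the median does.
least = min(grouped_mean_deviation(mid_values, freqs, o) for o in range(71))
print(least == delta_m)
```

The search over other origins is a check on this one table, not a proof of the general property.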

Characteristics and uses of Mean Deviation and its Co-efficient. 

Mean deviation and mean co-efficient of dispersion are 
easy to calculate and comprehend; they take all items into 
consideration and give weight to deviations according to their 
size. The co-efficient is usefully employed in economic studies 
like the distribution of personal wealth in a community where 
the rich and the poor both are considered. But the mean 
deviation does not lend itself readily to algebraical treatment. 
Other moments of dispersion have, therefore, come into use. 

(2) Second Moment of Dispersion, Standard Deviation 
and its Co-efficient. 

An alternative method of eliminating the algebraical 
signs of the deviations from an average is to square up each 
deviation. This method is employed here. Second moment of 
dispersion is the sum of the squares of the individual deviations 
from the arithmetic average divided by their number, 
viz., $\frac{\Sigma d^2}{n}$. Standard deviation is the square root of the Second 
Moment, viz., $\sqrt{\text{Second Moment}}$ or $\sqrt{\frac{\Sigma d^2}{n}}$, represented 
by $\sigma$ (sigma). Standard Deviation is an absolute measure of 
dispersion and is invariably computed from the arithmetic 
average, since it is least when arithmetic average is chosen as 
the origin from which deviations are measured. To compute 
the Standard Co-efficient of Dispersion, a relative measure, the 
standard deviation is divided by the arithmetic average, 
viz., $\frac{\sigma}{a}$. 

Similar moments and absolute measures of dispersion based 
on mode and median are quite conceivable. Second moment 
of dispersion computed from any mean or value other than the 
arithmetic average is sometimes termed Mean Square Deviation, 
and the absolute measure of dispersion based on such second 
moment is designated Root-mean Square Deviation. But 
root-mean square deviation is converted into standard deviation 
before it is used. Since the sum of the squares of the devia- 
tions from the arithmetic average is minimum, it is obvious 
that standard deviation is that root-mean square deviation 
whose value is the least, and second moment is that mean 
square deviation whose value is minimum. 
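The conversion of a root-mean square deviation into the standard deviation, and the reason why the latter is the least, can be set out in a short derivation. This is a sketch in the notation of this chapter; the symbols $A$ (any chosen origin other than the arithmetic average) and $s$ (the root-mean square deviation measured from $A$) are introduced here for illustration.

```latex
\begin{aligned}
s^2 &= \frac{\Sigma (x - A)^2}{n}
     = \frac{\Sigma \{(x - a) + (a - A)\}^2}{n} \\
    &= \frac{\Sigma (x - a)^2}{n}
       + \frac{2\,(a - A)\,\Sigma (x - a)}{n}
       + (a - A)^2 \\
    &= \sigma^2 + (a - A)^2 ,
       \qquad \text{since } \Sigma (x - a) = 0 . \\[4pt]
\therefore\quad \sigma &= \sqrt{\,s^2 - (a - A)^2\,}
\end{aligned}
```

Since $(a - A)^2$ can never be negative, $s^2$ is never less than $\sigma^2$, and the two are equal only when $A = a$. As a check on the earnings of Example 1, taking $A$ = Rs. 40 gives $\Sigma (x - A)^2 = 86$ and $s^2 = \frac{86}{12} = 7.17$; deducting $(42 - 40)^2 = 4$ leaves $\sigma^2 = 3.17$ approximately.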

Calculation of Second Moment, Standard Deviation and its Co- 
efficient. 

Direct Method. 

Example 4. Required to find standard deviation and its 
co-efficient when quantitative individual observations are 
given. 

Table 23. Calculation of Standard Deviation of X's Monthly
Earnings for 12 months by the direct method.

Months    Monthly earnings    Deviation from        Square of
          (Rs.)               Arithmetic Average    Deviation
                              (Rs. 42)
          m                   d                     $d^2$

1         39                  -3                    9
2         40                  -2                    4
3         40                  -2                    4
4         41                  -1                    1
5         41                  -1                    1
6         42                   0                    0
7         42                   0                    0
8         43                  +1                    1
9         43                  +1                    1
10        44                  +2                    4
11        44                  +2                    4
12        45                  +3                    9

n = 12    $\Sigma m$ = Rs. 504                      $\Sigma d^2$ = 38
          a = Rs. 42






Arithmetic Average or $a = \frac{\Sigma m}{n} = \text{Rs. } \frac{504}{12} = \text{Rs. } 42$

Second Moment of Dispersion $= \frac{\Sigma d^2}{n} = \frac{38}{12} = 3.17$ approximately.

Standard Deviation or $\sigma = \sqrt{\frac{\Sigma d^2}{n}} = \sqrt{3.17} = \text{Rs. } 1.78$ approx.

Standard Co-efficient of Deviation $= \frac{\sigma}{a} = \frac{1.78}{42} = .042$ approx.

We can now state that the arithmetic average of the given
series is Rs. 42 and its standard deviation is Rs. 1.78.
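
The working of Example 4 may be repeated mechanically. The following is a minimal sketch in Python, not part of the original text; the function name `direct_standard_deviation` is an illustrative assumption, and the figures are those of Table 23.

```python
import math

# Monthly earnings (Rs.) for the 12 months of Table 23.
earnings = [39, 40, 40, 41, 41, 42, 42, 43, 43, 44, 44, 45]


def direct_standard_deviation(values):
    """Direct method: deviations are measured from the arithmetic average."""
    n = len(values)
    a = sum(values) / n                                   # arithmetic average
    second_moment = sum((m - a) ** 2 for m in values) / n  # sum of d^2 over n
    sigma = math.sqrt(second_moment)                       # standard deviation
    return a, second_moment, sigma, sigma / a              # last item: co-efficient


a, second_moment, sigma, coefficient = direct_standard_deviation(earnings)
print(a, round(second_moment, 2), round(sigma, 2), round(coefficient, 3))
```

Rounded to the same number of places, the four figures agree with those obtained above.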

Example 5. Required to find standard deviation and its
co-efficient when a discrete series is given.

Table 24. Calculation of Standard Deviation by the direct
method.

Size of    Frequency    Sum of sizes        Deviation    Square of    Product of square
items                   (Frequency x        from mean    deviation    of deviation and
                        size of item)                                 frequency
m          f            mf                  d            $d^2$        $fd^2$

2          1            2                   -6           36           36
4          2            8                   -4           16           32
6          3            18                  -2           4            12
8          5            40                   0           0            0
10         3            30                  +2           4            12
12         2            24                  +4           16           32
14         1            14                  +6           36           36

           n = 17       $\Sigma mf$ = 136                             $\Sigma fd^2$ = 160
                        a = 8


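Arithmetic Average or $a = \frac{\Sigma mf}{n} = \frac{136}{17} = 8$

Second Moment of Dispersion $= \frac{\Sigma fd^2}{n} = \frac{160}{17} = 9.4$

Standard Deviation or $\sigma = \sqrt{\frac{\Sigma fd^2}{n}} = \sqrt{9.4} = 3.066$

Standard Co-efficient of Deviation $= \frac{\sigma}{a} = \frac{3.066}{8} = .383$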







We can state that the arithmetic average of the given series 
is 8, and its standard deviation is 3.066. 
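
For a discrete series each squared deviation must be multiplied by its frequency before summation. A minimal sketch of Example 5 follows; the function name `weighted_standard_deviation` is an assumption made for illustration.

```python
import math

# Sizes of items and their frequencies, as in Table 24.
sizes = [2, 4, 6, 8, 10, 12, 14]
frequencies = [1, 2, 3, 5, 3, 2, 1]


def weighted_standard_deviation(sizes, frequencies):
    """Direct method for a frequency series."""
    n = sum(frequencies)
    a = sum(m * f for m, f in zip(sizes, frequencies)) / n
    sum_fd2 = sum(f * (m - a) ** 2 for m, f in zip(sizes, frequencies))
    sigma = math.sqrt(sum_fd2 / n)
    return n, a, sum_fd2, sigma


n, a, sum_fd2, sigma = weighted_standard_deviation(sizes, frequencies)
print(n, a, sum_fd2, round(sigma, 3), round(sigma / a, 3))
```

Carried to more decimal places the standard deviation is about 3.068; the figure 3.066 in the text arises from rounding the second moment to 9.4 before the square root is extracted.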

Example 6. Required to find the standard deviation and 
its co-efficient when a continuous series is given. 

Suppose the class-intervals of a given series are 1-3, 3-5, 
5-7, 7-9, 9-11, 11-13, 13-15 and their respective frequencies are 
1, 2, 3, 5, 3, 2, 1. Then, the class-intervals will be placed in the 
first column of a table, their mid-points 2, 4, 6, 8, 10, 12, 14, in 
the second column, which tally with the size of items of table 24, 
example 5. The entire working of the example will thereafter 
be the same as that in example 5, resulting, in this particular 
case, in the same values of arithmetic average, second moment, 
standard deviation and standard co-efficient of dispersion. 
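
The reduction of a continuous series to its mid-points, as described in Example 6, may be sketched as follows. The variable names are illustrative assumptions; the class-intervals and frequencies are those given above.

```python
# Class-intervals and frequencies of Example 6.
class_intervals = [(1, 3), (3, 5), (5, 7), (7, 9), (9, 11), (11, 13), (13, 15)]
frequencies = [1, 2, 3, 5, 3, 2, 1]

# Each class is represented by its mid-point.
mid_points = [(lower + upper) / 2 for lower, upper in class_intervals]

n = sum(frequencies)
a = sum(m * f for m, f in zip(mid_points, frequencies)) / n
second_moment = sum(f * (m - a) ** 2 for m, f in zip(mid_points, frequencies)) / n

print(mid_points, a, round(second_moment, 1))
```

The mid-points coincide with the sizes of items of Table 24, so the average and the second moment come out the same as in Example 5.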

The preceding direct method of computing the standard 
deviation is easy and simple if the arithmetic average of the 
series is a whole number, as it was in examples 4, 5 and 6. 
But, often the arithmetic average happens to be a fraction or 
contains a fraction. Then, the deviations from the true mean 
would also be fractions or contain a fraction. Their calcula- 
tion and squaring up will be tedious. In such cases the short- 
cut method of computing the standard deviation can be use- 
fully employed in place of the direct method. 

Short-Cut Method. 


The short-cut method of computing the standard deviation, as 
we shall presently see, saves valuable time and tiresome effort. 
It gives the same value for the standard deviation as the direct 
method does. The procedure is as follows : Assume any size 
of item as an average ; compute deviations from it ; square each 
deviation ; summate ; subtract from the summation n times the 
square of the difference between the assumed average and the 
true average; divide by n and extract the square root of the 


quotient. 

The algebraic formula used in this method is:

$\sigma = \sqrt{\frac{\Sigma d_x^2 - n(a - x)^2}{n}}$

where x represents assumed average*, a, actual average, $\Sigma d_x^2$
the sum of the squares of deviations from assumed average (each
square being multiplied by its frequency in a frequency series),
and n, number of observations.

Example 7. Required to calculate standard deviation of a 
continuous series. (The procedure worked out in this 
example shall also apply to series of any other type.) 


Table 25. Calculation of Standard Deviation by the short-cut
method.

(a)        (b)      (c)      (d)               (e)              (f)       (g)
Size of    Mid-     Fre-     Sum of sizes      Deviation from   Square    Product of Square
item       value    quency   [col. (b) x       the assumed      of Dev.   of Dev. & Frequency
                             col. (c)]         average (5)                [col. (f) x col. (c)]
           m        f        mf                d                $d^2$     $fd^2$

1.5-2.5    2        3        6                 -3               9         27
2.5-3.5    3        4        12                -2               4         16
3.5-4.5    4        5        20                -1               1         5
4.5-5.5    5        8        40                 0               0         0
5.5-6.5    6        7        42                +1               1         7
6.5-7.5    7        6        42                +2               4         24
7.5-8.5    8        3        24                +3               9         27

                    n = 36   $\Sigma mf$ = 186                            $\Sigma fd_x^2$ = 106


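True Arithmetic Average or $a = \frac{186}{36} = 5.17$ approximately

Let the Assumed Average or x = 5; then $(a - x)^2 = \left(\frac{186}{36} - 5\right)^2 = \frac{1}{36}$

Therefore, $\sigma = \sqrt{\frac{\Sigma fd_x^2 - n(a - x)^2}{n}} = \sqrt{\frac{106 - 36\left(\frac{1}{36}\right)}{36}} = 1.71$ approximately.

And, Standard Co-efficient of Dispersion $= \frac{\sigma}{a} = \frac{1.71}{5.17} = .33$ approx.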


* The assumed average, x, is a whole number approximating the actual
average. That value which has the maximum frequency may also be
taken as the assumed average, or the working mean as it is often called.
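
That the short-cut method yields the same standard deviation as the direct method may be checked on the figures of Table 25. The sketch below is illustrative only; the variable names are assumptions.

```python
import math

# Mid-values and frequencies of Table 25.
mid_values = [2, 3, 4, 5, 6, 7, 8]
frequencies = [3, 4, 5, 8, 7, 6, 3]
assumed_average = 5  # x, the working mean

n = sum(frequencies)
a = sum(m * f for m, f in zip(mid_values, frequencies)) / n  # true average

# Short-cut method: deviations are taken from the assumed average,
# and n times the square of (a - x) is subtracted as a correction.
sum_fd2_x = sum(f * (m - assumed_average) ** 2
                for m, f in zip(mid_values, frequencies))
sigma_short_cut = math.sqrt((sum_fd2_x - n * (a - assumed_average) ** 2) / n)

# Direct method, for comparison: deviations are taken from the true average.
sigma_direct = math.sqrt(
    sum(f * (m - a) ** 2 for m, f in zip(mid_values, frequencies)) / n
)

print(sum_fd2_x, round(sigma_short_cut, 2), round(sigma_direct, 2))
```

Only whole-number deviations have to be squared in the short-cut working, which is the saving of labour claimed for it.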







Characteristics and uses of Standard Deviation and its Co-
efficient.

Standard deviation and its co-efficient possess all
those properties which a good measure of dispersion should.
The process of squaring the deviations eliminates negative
signs, and thus makes mathematical manipulation of figures
easy. Largely for this property, standard deviation has been
used by biologists. It has not found favour among economists
for two main reasons. Firstly, the squaring of deviations
gives greater weight to extreme items than it does to
those differing only slightly from the arithmetic average. This
factor has hardly any value in economic studies, since the
commercial or economic statistician is interested in the results of
the modal class. Secondly, the computation of standard deviation
requires considerably greater time and effort than that of
mean deviation. For the businessman rapidity in the preparation
of results is an important factor. Therefore, in economic
and commercial studies there is a tendency to use mean deviation
unless there is a particular reason for using the standard
deviation.

But, standard deviation enjoys at least two decided advantages
over the mean deviation. Firstly, it is, in general,
less affected by fluctuations of sampling, and secondly, it is
more easily amenable to algebraical processes. These two
properties make standard deviation useful for advanced work.
Its use is, therefore, increasing for measuring variability. The
standard deviation, for instance, has a special use in the
computation of Karl Pearson's Co-efficient of Correlation, which
will be discussed in a later chapter.

Modulus.

It is another measure of dispersion based on the
second moment of dispersion. It is generally represented by
c. The formula used is $c = \sqrt{\frac{2\Sigma d^2}{n}}$.

Variance.

The quantity $\sigma^2$ is known as Variance.

Co-efficient of Variation.

According to Karl Pearson, who first suggested its use,
co-efficient of variation is the 'percentage variation in the
mean, the standard deviation being treated as the total variation
in the mean.' In other words,

co-efficient of variation or $v = 100 \times \frac{\text{Standard Deviation}}{\text{Arithmetic Average}}$
= 100 x Co-efficient of Standard Deviation.

Thus, in example 7, co-efficient of variation = 100 x .33 = 33.
Co-efficient of variation is a relative measure of dispersion and
has come into wide use.
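
The three quantities just defined all follow from the standard deviation. The sketch below applies them to the figures of Example 7. It assumes the usual definition of the modulus as the square root of twice the second moment; the function name `modulus_variance_cv` is illustrative.

```python
import math


def modulus_variance_cv(sigma, a):
    """Return (modulus, variance, co-efficient of variation)."""
    variance = sigma ** 2
    modulus = math.sqrt(2) * sigma            # assumed: sqrt(2 * sum(d^2) / n)
    coefficient_of_variation = 100 * sigma / a
    return modulus, variance, coefficient_of_variation


# Figures of Example 7: second moment 105/36, arithmetic average 186/36.
sigma = math.sqrt(105 / 36)
a = 186 / 36
modulus, variance, cv = modulus_variance_cv(sigma, a)
print(round(modulus, 2), round(variance, 2), round(cv))
```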


(3) Quartile Deviation or Semi-Interquartile Range and
its Co-efficient.

The measures of dispersion discussed under First and
Second Moments of Dispersion take into account the deviations
of all the items. Quartile deviation, based on the quartiles,
affords a general idea of the dispersion of a group without
considering each particular item. The first and the third
quartiles include between them the middle half of the items
of a group. If the dispersion of this half could fairly represent
the dispersion of the whole group, a simple method of measuring
it will be:

Quartile Deviation or $Q.D. = \frac{Q_3 - Q_1}{2}$

where $Q_3$ stands for 3rd or upper quartile,
and $Q_1$ stands for 1st or lower quartile.

Quartile deviation is an absolute measure of dispersion.
Its relative measure, that is, Quartile Co-efficient of Dispersion,
is Quartile Deviation divided by the average of the two
quartiles.

Quartile Co-efficient of Dispersion $= \frac{(Q_3 - Q_1)/2}{(Q_3 + Q_1)/2} = \frac{Q_3 - Q_1}{Q_3 + Q_1}$

Calculation of Quartile Deviation and its Co-efficient.

Example 9. In example 1, table 20, $Q_3$ = Rs. 43.75 and
$Q_1$ = Rs. 40.25.

Quartile Deviation $= \text{Rs. } \frac{43.75 - 40.25}{2} = \text{Rs. } 1.75$

and Quartile Co-efficient of Dispersion $= \frac{43.75 - 40.25}{43.75 + 40.25} = .042$ approximately.

Upon the assumption that median lies half-way between the
upper and the lower quartiles, we may observe that in our
example

Median $= \frac{Q_3 + Q_1}{2} = \text{Rs. } \frac{43.75 + 40.25}{2} = \text{Rs. } 42$,

and the difference occurring on either side of it is Rs. 1.75.
In other words, median is Rs. 42 and half the items are within
the range Rs. 42 ± 1.75.
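
The arithmetic of Example 9 may be set out as a small routine. This is a sketch only; the function name `quartile_measures` is an assumption, and the quartiles are taken as given above rather than recomputed from the series.

```python
def quartile_measures(q1, q3):
    """Return (quartile deviation, quartile co-efficient, mid-point of quartiles)."""
    quartile_deviation = (q3 - q1) / 2
    coefficient = (q3 - q1) / (q3 + q1)
    mid_point = (q3 + q1) / 2
    return quartile_deviation, coefficient, mid_point


qd, coefficient, mid_point = quartile_measures(q1=40.25, q3=43.75)
print(qd, round(coefficient, 3), mid_point)
```

The mid-point of the quartiles equals the median only on the assumption, made in the text, that the median lies half-way between them.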


Characteristics and uses of Quartile Deviation and its Co-
efficient.

Quartile deviation and its co-efficient are simple to comprehend
and easy to compute. They are quite satisfactory if one is
concerned with the main body — the middle half — of the series 
and cares nothing about extreme variations. Quartile deviation 
has a serious drawback in that its value may be the same for 
series whose quartiles are the same, whatever the distribution 
of the observations between the quartiles and beyond them. 
This is because it considers only the quartiles and not each 
particular item of the array. It is, therefore, not so sensitive 
as the mean and the standard deviations. 


Choice of Measures of Dispersion.

A good measure of dispersion should possess some such
qualities as an ideal average does. That is, it should be based
on all the observations made, easily calculated, readily
understood, affected very little by fluctuations of sampling, and
amenable to algebraical treatment. The range, it has already
been seen, is not a satisfactory measure, and its co-efficient,
therefore, is not very much favoured by statisticians. Quartile
deviation enjoys two advantages over the standard and mean
deviations: it is easier to calculate and clearer in meaning.
But, since it has no simple algebraical properties and is liable 
to be erratic, it is not good for any but the most elementary 
statistical work, where only a rough estimate is desired. It is, 
however, not unsatisfactory when the distribution of values 
in a series is fairly symmetrical. If the distribution lacks 
symmetry and there are great differences in frequency between 
successive values of items in the series, it is better to select 
measures which give each value its due weight. Such measures 
are the mean deviation and the standard deviation. Mean 
deviation is less troublesome to calculate than the standard 
deviation, but cannot be used for further mathematical opera- 
tions. If in a given problem median suits the best, mean 
deviation would be a good measure of dispersion. But, as 
arithmetic mean is the most commonly used average, standard 
deviation, which is invariably computed from the arithmetic 
average, is the most important and the best measure of disper- 
sion. More so, because it is the least erratic, and is suitable 
for further algebraic manipulation. Its use is, therefore,
recommended for cases where positive and comparatively precise
results are desired.

Absolute and Relative Measures of Dispersion.

Two or more groups may be compared by stating their
respective means and absolute measures of dispersion provided
the means of the groups do not vary greatly in size and
the groups are measured in the same units. If the
difference between the means is great, it is safer to compare
their relative measures of dispersion, i.e., co-efficients, rather
than absolute measures. For example, if the production of a
commodity in factory A for three successive years be 1250,
2000 and 2750 units respectively, and that of factory B for
the same period runs in the order 3250, 4000 and 4750 units,
the mean deviation, 500 units, is exactly the same in both
cases. The ranges and the standard deviations are likewise
identical. If only these absolute measures are compared a
fallacious conclusion that the variability of production in the
two factories is the same might be drawn. But, when the mean
co-efficient of dispersion for the production of factory A, .25,
is compared with the mean co-efficient of dispersion for 
factory B, .125, the degree of variability in the production of 
the two factories is made clear. According to the general 
rule that the lower the co-efficient of dispersion (mean, 
standard or quartile), the smaller is the variability of the 
series, or in other words, the greater is the consistency, pro- 
duction in factory A is more variable than that in factory B. 
Co-efficients of dispersion, therefore, correct the wrong im- 
pression created by the absolute measures. 
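
The comparison of the two factories may be reproduced as follows. The sketch is illustrative; the function name `dispersion_summary` and the dictionary keys are assumptions.

```python
import math


def dispersion_summary(values):
    """Absolute measures of dispersion and the mean co-efficient of dispersion."""
    n = len(values)
    a = sum(values) / n
    mean_deviation = sum(abs(v - a) for v in values) / n
    sigma = math.sqrt(sum((v - a) ** 2 for v in values) / n)
    return {
        "mean": a,
        "range": max(values) - min(values),
        "mean_deviation": mean_deviation,
        "standard_deviation": sigma,
        "mean_coefficient": mean_deviation / a,
    }


factory_a = dispersion_summary([1250, 2000, 2750])
factory_b = dispersion_summary([3250, 4000, 4750])
print(factory_a["mean_deviation"], factory_b["mean_deviation"])
print(factory_a["mean_coefficient"], factory_b["mean_coefficient"])
```

The absolute measures are identical for the two factories, while the co-efficient for factory A is twice that for factory B.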

The absolute measures are concrete quantities, and should
be stated in terms of the units of the variable (Rs., miles,
inches, years etc.). It is impossible to compare dispersions
in two universes measured in different units — say, one 
measured in Rs. and the other in yards — through absolute 
measures of dispersion. For this reason also, as already 
pointed out, co-efficients of dispersion are computed. They 
enable comparison between universes of different characters, 
since they are pure numbers. 


Relation between Measures of Dispersion. 

Mean, standard and quartile deviations all measure the
same property, viz., dispersion; but they do it in different
ways. There does not exist a perfectly definite relation
between them. Yet, for moderately symmetrical distributions
the following relations are approximately true:

(1) Quartile Deviation = $\frac{2}{3}$ (Standard Deviation)

(2) Mean Deviation = $\frac{4}{5}$ (Standard Deviation), when
standard deviation is measured from the arithmetic
average, a.

(3) $a \pm 2\sigma$ or $a \pm 3\sigma$ shall cover a majority of the
observations of the group.

The above relations are not likely to hold good in cases
where the number of observations is comparatively small.
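
The three relations may be compared with the values that hold for the normal curve. The normal curve is used here only as an assumed stand-in for a moderately symmetrical distribution; the sketch relies on `NormalDist` from the Python standard library.

```python
import math
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)

# In a symmetrical curve the quartile deviation is the distance
# from the median to the upper quartile.
qd_over_sd = standard_normal.inv_cdf(0.75)

# Mean deviation of the normal curve, in units of the standard deviation.
md_over_sd = math.sqrt(2 / math.pi)

# Proportion of observations lying within a +/- 2 sigma and a +/- 3 sigma.
share_within_2_sigma = standard_normal.cdf(2) - standard_normal.cdf(-2)
share_within_3_sigma = standard_normal.cdf(3) - standard_normal.cdf(-3)

print(round(qd_over_sd, 4), round(md_over_sd, 4))
print(round(share_within_2_sigma, 4), round(share_within_3_sigma, 4))
```

On this assumption the two ratios fall close to 2/3 and 4/5, and the two ranges cover all but a small fraction of the observations.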


Lorenz Curve. 

The graphic method can also be used for showing the
dispersion of a group. The method adopted by Dr. Lorenz
for the study of the distribution of wealth (the curve showing
the distribution being designated after him as Lorenz Curve)
is the best for the purpose. It is, in effect, a cumulative
percentage curve, combining the percentage of items under
review with the percentage of the factor (say, wealth)
distributed among the items. In other words, it is an ogive curve*
formed by cumulating the values of the items on each
axis and reducing the values thus obtained to percentages
of the total. Figure 1 illustrates the construction
of the curve. If wealth were equally distributed among the
people, the curve would be a straight line, AB, connecting the
two extremes of the scales. In practice, however, curves like
a or b result. The less the distance between the curve showing
actual distribution and the line of equal distribution, the
greater is the homogeneity in the distribution of wealth and the
less is the dispersion. Conversely, the farther the curve of actual
distribution is from the line of equal distribution, the larger is
the percentage of poor people and the greater the concentration
of wealth in the hands of a few millionaires. It is evident
that dispersion in b is greater than that in a.

[Fig. 1. Lorenz Curve.]

The Lorenz Curve does not yield a numerical measurement
of dispersion and is, in this respect, inferior to measures of
dispersion. But its merit is that it affords a picture of
dispersion at a glance. It should be used along with a co-efficient
of dispersion when a detailed study of dispersion is desired.
Lorenz Curve is very useful in such studies as the distribution 
of land, wages and income among the population of a country 
or the distribution of profits over different groups of 
businesses. 
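
The points through which a Lorenz Curve is drawn may be computed as sketched below. The function name `lorenz_points` and both sets of figures are hypothetical, chosen only to contrast an equal with an unequal distribution.

```python
def lorenz_points(amounts):
    """Cumulative percentage of items against cumulative percentage of the factor."""
    ordered = sorted(amounts)          # poorest items first
    n = len(ordered)
    total = sum(ordered)
    points = [(0.0, 0.0)]
    running = 0
    for i, amount in enumerate(ordered, start=1):
        running += amount
        points.append((100 * i / n, 100 * running / total))
    return points


equal_shares = lorenz_points([10, 10, 10, 10])   # hypothetical: equal wealth
unequal_shares = lorenz_points([1, 2, 3, 14])    # hypothetical: concentrated wealth
print(equal_shares)
print(unequal_shares)
```

With equal shares every point falls on the line of equal distribution; with unequal shares the points fall below it, and the gap widens as concentration increases.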

Practical Utility of Measures of Dispersion. 

Measures of dispersion, it has already been indicated,
supplement the information given by the mean. But
they may also be computed for estimating the value of the
series itself. For instance, the dispersion of time series
affords a measure of the consistency or variability of the
phenomenon to which the series relates. Determination of the
degree of variability is, at times, very important in
politico-economic problems. Violent fluctuations in production or
trade naturally give a shock to the economic organism, which
affects many people. Measures of dispersion enlighten us on
the degree of these variations. Measures of dispersion are
invaluable in the study of such problems as inequalities of
income in a country, distribution of land among different
units of agricultural community, wage fluctuations etc. They
enable comparison between different groups of phenomena,
which is an important function that the Science of Statistics
has to perform.

* See Chapter XV.


SKEWNESS 

Skewness denotes the opposite of symmetry. As applied
to frequency distribution it indicates that the distribution of
items in it is not symmetrical. Skewness relates to the shape
of the curve of a frequency distribution.

Tests of Skewness.

The presence of skewness or asymmetry in a given series
can be tested in several ways. It is shown when the mode,
median and mean do not coincide. It is further shown when
the sum of the positive deviations from the median is not equal
to the sum of the negative deviations. It is also found to be
existing when the quartiles, or pairs of deciles, are not
equidistant from the median. It is also shown when at points of equal
deviations on either side of the mode the figures are unequal.
Lastly, if skewness is present in a frequency distribution, its
graph will not give the normal, bell-shaped, symmetrical form.
Rather, the base would be stretched to a greater extent on one
side than on the other. In curves which are not far away
from being symmetrical the median usually covers about
two-thirds of the distance travelled by the arithmetic average from
the mode (See Figure 2). Therefore, approximately,

$M = Z + \frac{2}{3}(a - Z)$

where M stands for median, Z for mode and a for mean.





In a skew curve the mode, median and arithmetic average normally
occur in sequence, the last being pulled away the farthest in
the direction in which the curve is skewed.

We shall apply the above tests to the following example. 


Table 26. Calculation of Mean, Median, and Mean and
Standard Deviations.

Columns: (1) Size of item, m; (2) Frequency, f; (3) Cumulative
frequency; (4) Sum of sizes, mf; (5) Deviation from median (7);
(6) Product of deviation from median and frequency, signs ignored;
(7) Deviation from mean (7.1), d; (8) Square of deviation, $d^2$;
(9) Product of square of deviation and frequency, $fd^2$.

(1)    (2)     (3)    (4)      (5)     (6)    (7)      (8)      (9)

3      1       1      3        -4      4      -4.1     16.81    16.81
4      2       3      8        -3      6      -3.1     9.61     19.22
5      3       6      15       -2      6      -2.1     4.41     13.23
6      5       11     30       -1      5      -1.1     1.21     6.05
7      8       19     56        0      0      -0.1     .01      .08
8      6       25     48       +1      6      +0.9     .81      4.86
9      4       29     36       +2      8      +1.9     3.61     14.44
10     2       31     20       +3      6      +2.9     8.41     16.82
11     1       32     11       +4      4      +3.9     15.21    15.21

       n = 32         $\Sigma mf$ = 227   -10   45   -10.5            $\Sigma fd^2$ = 106.72
                                          +10        +9.6








First Test: The arithmetic average is $\frac{227}{32}$ or 7.1, the
median (value of 16.5th item) is 7, and the mode is also 7.
The values of mean, median and mode, therefore, do not exactly
coincide. The curve is not perfectly symmetrical.

Second Test: Deviations from the median in column 5
show fine symmetry. The negative sum of deviations, -10, is
equal to the positive sum, +10, leading us to think that the
curve is symmetrical.

Third Test: $Q_3 - M = 8 - 7 = 1$; $M - Q_1 = 7 - 6 = 1$. The two
quartiles happen to be equi-distant from the median. Therefore,
the curve of the series appears to have a symmetrical
form.





Fourth Test: We take equal deviations on either side of
the mode and compare the frequencies centering round them. 
6 and 8 are equidistant from the mode which is 7. But the 
frequencies against 6 are 5 and against 8 are 6. Therefore, 
the series is not perfectly symmetrical. 

The first and the fourth tests indicate that the shape of 
the curve of the given frequency distribution lacks symmetry ; 
while, the second and the third tests show that the curve 
might be symmetrical. The second and the third tests do 
not always provide a correct answer. We may, therefore, 
conclude that the curve of the given series falls slightly short 
of perfect symmetry. "By how much does the curve lack
symmetry?" may be the natural question. For answering it,
we must reduce skewness to numerical quantity. This brings 
us to the measures of skewness. 

Measures of Skewness.

The first measure of skewness, and the simplest too, is
based on the fact that in a skew distribution the mean and
the mode are drawn widely apart. The larger the distance
that the mean (a) is pulled beyond the mode (Z), the greater
is the degree of skewness. The second measure of skewness
is based on the fact that in a skew curve the median does not
lie half-way between the quartiles, the quartile nearest to
the stretched base being pulled in that direction more than
the other quartile. But these measures of skewness should
be reduced to Co-efficients of skewness for just the same reasons
as measures of dispersion are reduced to co-efficients. In
computing the co-efficients of dispersion the measures of
absolute dispersion are divided by the average used. Average
would not be the proper divisor here, for the question now is
not how much the curve is asymmetrical in proportion to values
of the items of the series, but how much more the items on
one side deviate than they do on the other. Therefore,
measures of skewness shall be divided by the related measures
of dispersion. On these principles are based the measures and
co-efficients of skewness discussed below.

First Measure and Co-efficient of Skewness.

Measure of Skewness = a - Z, i.e., the difference between
mean and mode.

Co-efficient of Skewness or $j = \frac{a - Z}{\text{Mean Deviation from mode}}$ (A)

or $j = \frac{a - Z}{\text{Mean Deviation from mean}}$ (B)

And if the mode is ill-defined, the numerator may be replaced
by the difference between mean and median, so that,

$j = \frac{a - M}{\text{Mean Deviation from median}}$ (C)

Karl Pearson has given a formula in which standard deviation
is employed as the denominator, rather than mean deviation
used above, so that,

$j = \frac{a - Z}{\sigma}$ (D)

And if the mode is ill-defined, then on the basis of the relationship
between median, mode and mean pointed out above*,
formula (D) may be modified as below:

$j = \frac{3(a - M)}{\sigma}$ (E)

The above co-efficients yield a pure number, and are, there- 
fore, independent of the units in which the variable is 
measured, and secondly they shall give a zero for symmetrical 
series. These are the two properties which a good measure
of skewness should possess. That is why the above methods of
measuring skewness are termed ideal or standard methods.
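
The step from formula (D) to formula (E) may be set out in full. It is a sketch resting wholly on the approximate relation $M = Z + \frac{2}{3}(a - Z)$ stated under Tests of Skewness, and is therefore itself only approximate.

```latex
M = Z + \tfrac{2}{3}(a - Z)
\;\Longrightarrow\; a - M = (a - Z) - \tfrac{2}{3}(a - Z) = \tfrac{1}{3}(a - Z)
\;\Longrightarrow\; a - Z = 3(a - M),
\qquad\text{so that}\qquad
j = \frac{a - Z}{\sigma} = \frac{3(a - M)}{\sigma}.
```

The two formulae give the same result only in so far as the median actually lies two-thirds of the way from the mode to the mean.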

According to these formulae the following are the results
of our series. The measure of skewness = a - Z = 7.1 - 7 = +.1

(A) j = +.071

(B) j = +.0702

(C) j = +.071

(D) j = +.0548

(E) j = +.1644

* See page 190.


In theory there is no limit to the values of co-efficients
yielded by the formulae A, B, C, & D, and this is a slight
drawback. In practice the results are rarely very high, and
for moderately asymmetrical curves they are usually less than
unity. The co-efficient yielded by the formula E lies between
the limits -3 and +3. In practice it rarely approaches that
limit.

Since none of the above formulae yields zero co-efficient of
skewness for our series, we are led to conclude that the curve
is skew. But, since the value of the co-efficient is very small,
we may add that the degree of skewness is very slight.
Skewness, it should be noted, is positive in this case, i.e., mean
is greater than mode.
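
The co-efficients obtained for the series of table 26 may be recomputed as sketched below. The variable names are assumptions; the mean is rounded to 7.1 as in the text, and the median is taken as 7 from the First Test rather than recomputed.

```python
import math

# Sizes of items and frequencies of table 26.
sizes = [3, 4, 5, 6, 7, 8, 9, 10, 11]
frequencies = [1, 2, 3, 5, 8, 6, 4, 2, 1]

n = sum(frequencies)
a = round(sum(m * f for m, f in zip(sizes, frequencies)) / n, 1)  # 7.1, as in the text
mode = sizes[frequencies.index(max(frequencies))]                # size with highest frequency
median = 7                                                        # value of the 16.5th item


def mean_deviation(origin):
    """Mean deviation measured from the given origin, signs ignored."""
    return sum(f * abs(m - origin) for m, f in zip(sizes, frequencies)) / n


sigma = math.sqrt(sum(f * (m - a) ** 2 for m, f in zip(sizes, frequencies)) / n)

j_md_from_mode = (a - mode) / mean_deviation(mode)
j_md_from_mean = (a - mode) / mean_deviation(a)
j_pearson = (a - mode) / sigma                  # formula (D)
j_pearson_median = 3 * (a - median) / sigma     # formula (E)

print(round(j_md_from_mode, 3), round(j_md_from_mean, 4),
      round(j_pearson, 4), round(j_pearson_median, 4))
```

All four values are small and positive. The last differs from the figure in the text only in the fourth decimal place, according to how the standard deviation is rounded.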

Second Measure and Co-efficient of Skewness.

The measure of skewness $= (Q_3 - M) - (M - Q_1) = Q_3 + Q_1 - 2M$,

where $Q_3$ stands for upper quartile, $Q_1$ for lower quartile and
M for median.

The Co-efficient of Skewness, or $j = \frac{Q_3 + Q_1 - 2M}{Q_3 - Q_1}$

This co-efficient is also a pure number, and is zero for
symmetrical distributions. Its result varies from -1 to +1.
Whether the particular value of a co-efficient is
significant or not is a matter of experience. It may, however,
be suggested that .1 denotes a moderate degree, and .3 a
considerable degree of skewness.

According to the above formulae, in our series, table 26,

Measure of Skewness $= Q_3 + Q_1 - 2M = 8 + 6 - 14 = 0$

Co-efficient of Skewness $= \frac{Q_3 + Q_1 - 2M}{Q_3 - Q_1} = \frac{0}{2} = 0$


This measure of skewness and co-efficient suggest that the
curve of the frequency distribution in table 26 is symmetrical,
a suggestion not warranted by the first measure and co-efficient
of skewness. This is the weakness of the second measure and
co-efficient. It is the same weakness as that possessed by
quartile co-efficient of dispersion: that is, it fails to take into
account the size of the extreme variations, since it is concerned
only with the quartiles and the median. This limitation should
be clearly borne in mind before results yielded by this formula
are relied upon. This co-efficient is simple and easy to calculate
and is sufficiently reliable in those studies in which
extreme instances are considered unimportant. It is a rather
rough-and-ready measure and might be used where quartile
deviation is being used as a measure of dispersion. Where
comparatively greater accuracy is required, Karl Pearson's
Co-efficient of Skewness should be employed.
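
The second measure and its co-efficient may be sketched as a short routine. The function name `quartile_skewness` is an assumption, and the second set of quartiles is hypothetical, introduced only to show a non-zero result.

```python
def quartile_skewness(q1, median, q3):
    """Return (second measure of skewness, its co-efficient)."""
    measure = q3 + q1 - 2 * median
    coefficient = measure / (q3 - q1)
    return measure, coefficient


# Series of table 26: lower quartile 6, median 7, upper quartile 8.
table_26 = quartile_skewness(q1=6, median=7, q3=8)

# Hypothetical series whose upper quartile is pulled away from the median.
stretched = quartile_skewness(q1=6, median=7, q3=10)

print(table_26, stretched)
```

For table 26 both results are zero, although the first measure showed slight positive skewness; the quartiles alone do not register what happens beyond them.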


Positive and Negative Skewness. 

Skewness can be positive as well as negative. If the arithmetic
average is greater than the mode or the median, skewness
is positive. If it is less, skewness is negative. When
skewness is positive, mean would travel to the right of the
mode in the curve. It would be to the left of the mode, when
skewness is negative. In our example, the answer is positive.
Positive and negative skewness are symbolised by plus and
minus signs respectively.

Figure 2 illustrates slight positive skewness. It also shows
the positions of mode, median and arithmetic mean, indicated
by Z, M, and a respectively, in an ideal moderately asymmetrical
distribution: the median travels about 2/3rds the distance
travelled by the mean from the mode. It is also clear that
mode, median and mean occur in sequence in a skew curve,
and the mean is pulled away the farthest in the direction in
which the curve is skewed. It may also be noted that the
median bisects the area of the curve (called histogram).

[Fig. 2. Horizontal axis: Size of item.]

Mode will remain unaffected by the addition of a few more
items, but the median and the mean will be deflected.

Dispersion and Skewness contrasted.

Measures and co-efficients of dispersion, respectively, indicate
the absolute and relative differences between the individual
items of the series and an average taken as the standard.
They do not, however, show the extent to which deviations 
cluster above or below the average selected. 

Measures of skewness, on the other hand, show the extent 
to which distributions are pulled away, or distorted, from the 
ideal, symmetrical curve. In a symmetrical curve mode, 
median and mean coincide; in an unsymmetrical curve they do 
not. Measures of skewness have two functions to perform: 
firstly, they indicate the direction of asymmetry through their 
positive and negative character; secondly, they measure the 
amount of asymmetry in absolute or relative terms through 
the value obtained for the measure or the co-efficient. 

The theory of skewness is more important in biological
studies and other studies depending more or less upon the
laboratory experiments than in economic and social investigations.
In social and economic inquiries a perfectly symmetrical
distribution is an exception, and a large degree of skewness
is generally expected.

It is interesting to note the important part that median 
and quartiles play in statistics. The three characteristics of 
a group can be studied simply through them: the median 
locating the central position, quartile deviation showing dis- 
persion, and the second measure of skewness showing skew- 
ness. It may again be noted that skewness relates to the 
shape of the curve rather than to its size. 

EXERCISES 

(1) What do you understand by dispersion? Explain the
various methods of its measurement and point out their advantages 
and disadvantages. 

(B. Com., Luck., 1930). 

(2) Describe carefully how Mean Deviation, Standard Devia- 
tion and Quartile deviation of any given distribution are obtained. 
In what problems should each be used?

(3) What is Skewness? How would you find it in a non- 
symmetrical distribution? Distinguish between positive and nega- 
tive skewness. 

(4) What is meant by Skewness? How does it differ from 
Dispersion? What is the object of measuring these?

(B. Com., Alld., 1943). 

(5) Distinguish between absolute and relative measures of 
dispersion. Why are the latter computed? 

(6) Write short notes on:

Range, First Moment and Second Moment of Dispersion, 
Standard deviation, and Quartile deviation. 

(7) Describe the methods of calculating the Standard Devia- 
tion and state the relationship between it and the Mean Devia- 
tion for a moderately asymmetrical distribution. 

(8) What do you understand by Modulus, Variance and Co-
efficient of Variation? Give their formulae.

(9) Explain with the help of specimen curves

(a) Lorenz Curve, (b) Moderately asymmetrical curve.

Point out their salient features.

(10) Calculate the Mean and the Standard Coefficients of Dispersion
of the two series relating to marks in Economics and
Politics, given in exercise 2S, chapter X. What light do the
coefficients throw on the variability of the series?

(11) Find the Mean, Quartile and Standard deviations of the
population of 36 cities of India given in exercise 20, chapter X.

(12) Find the coefficient of skewness of the series given in
exercise 22, chapter X. What is the character of skewness,
positive or negative? What does it imply?

(13) From the data given in exercise 24, chapter X, compute
the Coefficient of skewness, and state what light the coefficient throws
on the shape of the curve to be drawn from the data.

(14) Calculate the mean and standard deviations of the marks
obtained by students in Class A and Class H, given in exercise 29,
chapter X. State what you can about the variability of the
marks.

(15) Find the standard deviation, and its coefficient for the
frequency distribution given in exercise 36, chapter X. 

(16) Find the mean deviation and its coefficient for the data 
given in exercise 36, chapter X. 

(17) Draw a curve of the figures given in exercise 40, 
chapter X. Do you think it is a perfectly symmetrical curve? 
Verify your answer by finding the mode, median and mean. 

(18) Calculate the coefficient of mean deviation and the co- 
efficient of quartile deviation for the marks obtained by 30 
students, given in exercise 46, chapter X. 

(19) Calculate the range and its coefficient for the values of 
exports and imports given in exercise 43, chapter X. Comment 
on your result. 

(20) Calculate the coefficient of variation of the following
monthly incomes of twenty families, given in rupees:

2,000; 35; 400; 16; 40; 1,500; 300; 6; 90; 250; 20; 12;
450; 10; 160; 8; 26; 30; 1,200; 60.

(B. Com., Alld., 1941). 

(21) Find the Arithmetic Average, the First Moment of Dis- 
persion, and the Standard Deviation from the data in the follow- 
ing series: — 

Size of item          Frequency

   3-4                     3
   4-5                     7
   5-6                    22
   6-7                    60
   7-8                    85
   8-9                    32
   9-10                    8

(B. Com., Alld., 1942). 

(22) The following table shows the number of workers in two 
factories whose weekly earnings are given in column (1). Determine 
the mean values of weekly earnings and standard deviation in both 
factories. 


Range of weekly         Number of workers in
   earnings          Factory A       Factory B

     4-6                 74              71
     6-8                376             379
     8-10               304             303
    10-12               110             112
    12-14                18              18
    14-16                 0               1
    16-18                 9               3
    18-20                 9               9
    20-22                 0               4

    Total               900             900

(M.A., Cal., 1936).

(23) Calculate the mean deviation from the following data. 
What light does it throw on the social conditions of the community?

Difference in age between husband and wife in a particular
community

Difference in years     Frequency

      0-5                  449
      5-10                 705
     10-15                 507
     15-20                 281
     20-25                 109
     25-30                  52
     30-35                  16
     35-40                   4


(B. Com., Bombay, 1936). 
(24) What is a coefficient of dispersion? Find the mean,
standard and quartile deviations from the following figures, and
comment.


                          Number of persons
Height in inches       Group A       Group B

      57                   8            13
      58                  18            20
      59                  30            32
      60                  42            35
      61                  35            33
      62                  28            22
      63                  16            20
      64                   8            10


(25) Calculate the mean and the standard deviations of the
following figures and state the percentage of cases which lie outside
the mean at distances ±σ, ±2σ, where σ stands for the
standard deviation.

115, 117, 121, 125, 116, 120, 118, 117, 119, 116, 

122, 124, 123, 118, 120, 118, 126, 127, 122, 125. 

(26) The following table gives the exports of some commo-
dities from India:

                                         1937-38  1938-39  1939-40  1940-41  1941-42
Exports of Pig Iron (000 tons)              629      515      572      596      522
Exports of Raw Cotton (000 bales)          2730     2703     2948     2168     1438
Exports of Cotton goods (million yds.)      241      177      221      390      779

Which of the above exports is most variable from year to year? 

(27) Summary of Receipts and Passengers of a certain
Motor Bus Co.

Year         Receipts       Passengers

1925           2,354           50,010
1926           2,780           61,060
1927           3,011           70,005
1928           3,020           70,110
1929           3,541           83,001
1930           4,150           91,100
1931*          5,000          100,000


* The figures for 1931 are mere estimates. 

From the foregoing data, find out one measure of dispersion,
and state whether the variation in receipts is greater than that in
passengers.

(B. Com., Alld., 1932).

(28) Calculate the Standard Deviation of the following data 
with regard to 2,298 families in the U. K. 


Number of persons in the family    1    2    3    4    5    6    7    8    9   10   11   12   Total

Number of families               165  552  580  453  268  118   77   41   20    8    5   11    2298


(M.A., Alld., 1942). 


(29) The following are the rents of 18 houses in a certain
locality:

Rs.   A.          Rs.   A.

 6    8            6    4
 5    0            3    0
 5    4            9    0
 5    8            4    8
 5    4            4    0
 4   12            5    0
 4    0            3   12
 5    0            5    0
 4    8            3    0

Calculate the mean deviation of this group.

(B. Com., Luck., 1930). 

(30) The following table gives the number of finished articles
turned out per day by different numbers of workers in a factory.
Find the mean value and 'standard deviation' of the daily out-
put of finished articles, and explain the significance of 'standard
deviation':

Number of     Number of     Number of     Number of
articles      workers       articles      workers

   18             3            23            17
   19             7            24            13
   20            11            25             8
   21            14            26             5
   22            18            27             4


(B. Com., Cal., 1937).








(31) Write short notes on

(1) Dispersion 

(2) Standard deviation. 


Calculate the standard deviation from the following data:

Size of item          Frequency

      6                    3
      7                    6
      8                    9
      9                   13
     10                    8
     11                    5
     12                    4

    Total                 48

(B. Com., Bombay, 1930).



CHAPTER XII

INDEX NUMBERS

An Index Number is a number which indicates the level
of a certain phenomenon at any given date in comparison with
the level of the same phenomenon at some standard date.

It offers a device for estimating the relative changes of a 
variable in cases where measurement of its actual changes is
inconvenient or impossible. If we want to measure the 
changes from one period to another in a factor, the change 
may not be capable of direct measurement. But, an evidence 
of it may be had from a measurement of the quantities 
influenced by the factor under consideration. These measure- 
ments may be expressed in different units. Therefore, their 
movements will not be directly comparable. To make them 
comparable, we may reduce the changes to a common deno-
minator. We may, therefore, express them as percentages of
similar measurements for a selected date. When so expressed,
the percentages shall form a group. Each one of these per-
centages shall throw some light on the incommensurable,
hidden, factor about which we desire information. If we 
take an average of all these percentages, it would afford an 
approximate idea of the change in the factor in question. 
This average is called Index Number. Thus, averages linked 
with percentages constitute the whole basis upon which is 
raised the superstructure of a simple device of comparing 
factors which are not directly comparable. 

Let us take an example. Suppose we are concerned with
measuring the general changes in the price level. These 
changes are not directly measurable. Evidence of them can 
be seen in changes in the prices of different commodities. 



Quotations of these prices shall be available in different units, 
e.g., of wheat in Rs. per maund, of cotton in Rs. per bale, 
of petrol in Rs. per gallon. They are not directly comparable. 
If we plot them on a graph paper, reliable conclusions with 
regard to their movements will hardly be drawn. To make
them comparable, we may express them as percentages of
corresponding prices of some selected date. These percent-
ages, relating to different commodities for the same date, shall
constitute a group. Each of these percentages shall reflect, 
in one way or the other, the change that has taken place in 
the general price level. When we compute an average of 
these percentages, the resulting average would show an
approximate general change in the level of prices from the
standard date to the date under consideration. This average
is called the Wholesale Price Index Number, or the General
Index Number.

Index numbers are not used for measuring changes in 
general level of prices alone. They are as well employed to 
measure movements of wages, employment, cost of living, sales,
production, investment, business activity, shares and stocks
and a multitude of other phenomena over a period of time.
In fact, where an attempt is made to bring to light what is
enshrouded in complex variations of the items of a time
series, they are invaluable. Movement in prices is a
matter of general economic interest. To a layman a rupee is 
just a rupee, but changes in its purchasing power are very 
often a nuisance. The technique of index numbers is, partly 
for this reason, generally studied in connection with prices. 

Fluctuations in General Price Level.

Prices of commodities fluctuate very often. When the 
price of a single commodity changes, the reason for it may be 
found in a change in supply of that commodity without a
corresponding change in its demand or in a change in its
demand without a corresponding change in its supply. 





And, if the price of a commodity which is a substitute 
for another changes, the reason for it may, in addition, 
be attributed to a change in the supply or demand of 
the substitute. But, when the prices of two, or a number 
of, unrelated commodities, say brassware and cloth, rise or 
fall together, several reasons may be advanced to explain the 
situation, but the real one may lie in an alteration in the 
measuring rod itself. This measuring rod is money, and its 
value, unlike that of other measuring instruments such as the 
foot-scale, the maund and the yard-stick, changes with supply 
of and demand for it. Change in the value of money implies 
change in prices. According to the Quantity Theory of
Money, a fall in the value of money is the same thing as a
rise in the general level of prices, or vice versa. But, if change 
in the value of money were the only cause of change in prices, 
the prices of all commodities would show nearly the same 
proportional rise or fall, and, therefore, it would be easy to 
say in what direction the value of money or its purchasing 
power moved by looking at the change in the price of any 
single commodity. As we know, change in the value of money 
is only one of the several causes, and it may be so intermingled 
with other causes like fluctuations of demand for and supply 
of goods that it may be difficult to say to what extent the 
value of money has changed. This complex phenomenon is
simplified through the device of Price Index Numbers. 


CONSTRUCTION OF INDEX NUMBERS OF PRICES. 

We have seen the necessity of constructing Index Numbers
of Prices. The technique of their construction, or, for that
matter, of the construction of any index number, in-
volves the following major problems:

I. The selection of items to be included, their
number and quotations.

II. The choice of the base period.

III. The type of average to be used.

IV. The system of weighting to be adopted.

Selection of Items. 

The general index number is based on the prices of com- 
modities exchanged in the market. But, it is neither possible 
to include nor to obtain regular price quotations of all the 
commodities that are bought and sold in various markets of 
a country. Therefore, the number of commodities on whose 
prices the general index number is based has to be brought 
down to a manageable limit. That is, sample data have to be 
used. The commodities selected for the purpose should be: 

(a) Representative of the tastes, habits and customs
of the people,

(b) Easily recognisable, and

(c) Unlikely to vary in quality.

These restrictions would not allow a large number of manu- 
factured goods to enter the index number, since they vary 
in quality. Nor will personal services be included, since they 
are not represented by tangible goods and can be measured 
in none but the monetary standard. Reliable quotations, 
however, of foodstuffs, raw materials and semi-manufactured 
goods are usually available. Therefore, the choice of items 
on which price index numbers are generally based is restricted 
to these commodities. And, even of these commodities those 
whose qualities and descriptions are standardized are commonly 
selected. 

The next question that arises is: 'what number of items
should be included?' There is no hard and fast rule for
deciding the number. In fact, the larger the number of items
the better would be the random sample and the greater would
be the tendency for errors to compensate one another. But
it should also be noted that complications, expense and delay 
in constructing the general index number increase with 
increase in the number of items. Therefore, a reasonable
number of items consistent with economy, simplicity and
accuracy of construction should be taken. In India, the
Calcutta Wholesale Price Index Number includes 72 items, and
the Bombay Wholesale Price Index Number 40. In Britain, the
Board of Trade Wholesale Price Index includes 150 items, and the
Economist and the Statist Wholesale Price Indices include,
respectively, 58 and 45 items. In America, a wider range of
quotations is in use: the U.S. Bureau of Labour Statistics' Index
of Wholesale Prices includes 450 items and Dun's Index 200.

With these indices may be contrasted the so-called 
sensitive index numbers which are based upon a smaller 
number of items (say 20) supposed to be specially sensitive 
to fluctuations in business conditions. The index numbers of
weekly wholesale prices of certain articles in India based on
23 commodities, compiled by the Economic Adviser to the Govern-
ment of India, the index of 15 primary commodities compiled
by the Bank of England, and the index of 20 basic commodities
compiled by the Federal Reserve Board of New York may be
cited as examples.

When the commodities have been selected and their
number decided, the next task is to make arrangements for
obtaining regular quotations of prices. Quotations may be
had from standard trade journals or from leading businessmen
dealing in the commodities selected for the index. Great
caution is required in selecting reliable trade journals and
dependable business houses. We have seen in chapter VIIB
why quotations of prices published in India are not fully
reliable. If the agency of correspondent businessmen is to be
employed, leading businessmen of one town alone will not be
representative of the entire country. Nor is it feasible to
have quotations for a commodity from leading dealers of all
the towns of the country. Therefore, representative places,
from which quotations would be obtained, will have to be
selected. That is, a sample of towns will be taken. The
criterion would be to select those places where a given com-
modity is bought and sold in large quantities. It is not
necessary that quotations for all the selected commodities
should be obtained from the same place, but it would be eco-
nomical to get quotations for as many representative commodi-
ties from one town as possible.

* Page 70.

After obtaining a sample of towns it would be necessary 
to have a sample of the leading dealers of the selected towns 
for obvious reasons. 

How should prices be quoted? Prices should be quoted as
so much money per unit of a commodity (e.g., Rs. 5-8-0 per
maund) and not as so many units of commodity per unit of
money (e.g., 7 seers 4 chattaks per rupee). The former are
called money prices and the latter quantity prices. Before
1907, prices in 'Prices and Wages in India' published by the
Government were expressed as seers and fractions of a seer
per rupee. Use of such quantity prices needs proper care
and caution. Money prices vary inversely with quantity prices
and the percentage rise and fall also varies in the two nota-
tions. Thus, if rice sells at 4 seers per rupee and later changes to
2 seers per rupee, the quantity price would be said to show a
fall from 100% to 50%. But the money price for rice would
change from Rs. 10 per maund to Rs. 20 per maund. It would
show a rise from 100% to 200%.
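
The rice example can be checked with a short calculation. The following is only an illustrative sketch in Python; the function names are our own, and the only fact assumed beyond the text is that a maund contains 40 seers.

```python
SEERS_PER_MAUND = 40  # 1 maund = 40 seers


def money_price_per_maund(seers_per_rupee):
    """Convert a quantity price (seers per rupee) into a money price (Rs. per maund)."""
    return SEERS_PER_MAUND / seers_per_rupee


def relative(current, base):
    """Express `current` as a percentage of `base`."""
    return current / base * 100


# Quantity prices of rice at the earlier and the later date (seers per rupee).
quantity_prices = [4, 2]
money_prices = [money_price_per_maund(q) for q in quantity_prices]

quantity_relative = relative(quantity_prices[1], quantity_prices[0])
money_relative = relative(money_prices[1], money_prices[0])

print(money_prices)       # money prices in Rs. per maund at the two dates
print(quantity_relative)  # later quantity price as a percentage of the earlier
print(money_relative)     # later money price as a percentage of the earlier
```

The same change in the market thus appears as a halving in one notation and a doubling in the other, which is why the two notations must never be mixed in one index.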

It need hardly be emphasized that the quotations should
relate to wholesale prices and not to retail prices, if changes 
in the general level of prices are to be measured. Wholesale 
prices are far more uniform over a given region for the same 
day and are more sensitive to the slightest changes in the 
conditions of demand and supply than the retail prices. 
Wholesale prices are, therefore, a better guide for disclosing 
movements of economic forces that affect and determine prices
than retail ones. Retail prices, as a matter of fact, lag behind
wholesale prices both in their rise and fall and they also
fluctuate between narrower limits. 



While making arrangements for obtaining price quota-
tions the quality of the commodity should be correctly specified,
otherwise prices of different qualities of the same commodities
may be quoted from different places at the same time, or from
the same places at different times. If such is the case the
resulting index would be a hotch-potch, incomparable, figure.
To make matters easy, quotations of those commodities whose
qualities and descriptions are standardised are taken.

If it is found necessary to give special importance to a 
commodity, quotations for a few different qualities of the 
same commodity may be obtained. For example, if special 
importance is to be given to sugar, the prices of sugar bearing 
the trade descriptions of Marhowrah Crystal, Dobarrah, Java 
White may be taken separately for each selected town. This
would be one way of assigning weights to different commodi-
ties in proportion to their importance.

How often should the quotations be obtained? Quotations
can be obtained daily, weekly, or fortnightly depending upon
the nature of the index number. For a monthly index number
two quotations per week would be quite adequate.

A somewhat delicate problem arises when the price of an 
article is 'controlled' by the government, but illicit sales take
place at uncontrolled prices.

When price quotations have been obtained, they should be
averaged. The process would be to add up the prices for
a commodity quoted from all the selected places on a particular
date, and divide the sum by the number of places.
The quotient would give the average price for that commodity
for the country for the particular date, if daily prices are
being used, or for the week, if weekly prices are being received.
To calculate the average price for the month or the year, the
procedure would be to add up the weekly average or daily
average prices of the same commodity and divide the sum by
the number of weeks or days, as the case may be. If, however,
the index is to be based on the prices of only one town, its
wholesale prices would be used, and the necessity of striking
an average of the prices for the country would be avoided.
The following table gives the average yearly wholesale prices
of certain commodities in Cawnpore in rupees per maund.

Table 27. Average Wholesale Prices of Certain Commodities
in Cawnpore, 1928-1934.

                          Average Prices in Rs. per Maund
Line  Commodities    1928   1929   1930   1931   1932   1933   1934

 1    Rice            7.3    7.7    5.8    4.1    4.3    4.1    3.7
 2    Wheat           7.7    5.5    3.6    2.7    3.4    3.2    2.8
 3    Linseed         7.0    8.0    6.5    4.2    3.5    3.4    3.6
 4    Gur             6.5    7.3    6.2    4.2    3.5    3.1    4.1
 5    Cotton         34.1   29.8   17.3   13.3   14.8   12.9   13.2
 6    Tobacco        17.3   17.1   14.5   11.6    4.9    4.9    5.7
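
The averaging of quotations described before Table 27 (first over places, then over weeks) may be sketched as below. The quotations are hypothetical figures invented purely for illustration; they are not taken from Table 27 or from any published source, and the variable names are our own.

```python
from statistics import mean

# Hypothetical weekly wheat quotations (Rs. per maund) from three selected towns.
weekly_quotes = {
    "week 1": [5.0, 5.5, 6.0],
    "week 2": [6.0, 6.0, 6.0],
    "week 3": [5.0, 6.0, 7.0],
    "week 4": [6.0, 6.5, 7.0],
}

# Step 1: average over places to get the price for the country for each week.
weekly_average = {week: mean(quotes) for week, quotes in weekly_quotes.items()}

# Step 2: average the weekly averages to get the price for the month.
monthly_average = mean(weekly_average.values())

print(weekly_average)
print(monthly_average)
```

A yearly average would be struck in the same way from the weekly or monthly averages.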


Choice of Base. 

The next step would be to reduce the average prices to 
relatives. For doing so, an appropriate base in terms of which 
the prices shall be expressed as percentages should be selected. 
Two methods are available for the purpose: 

I. Fixed Base. 

II. Chain Base. 

Fixed Base Method.

With the fixed base method either (i) the average
price of some arbitrarily chosen year is taken as the
base, or (ii) the average of the prices of a period of
years is taken as the base, and the base chosen is adhered to
for an indefinite time. In following the latter method either
the prices for five or ten years may be averaged, or the prices
of the entire period for which index numbers are to be con-
structed may be averaged. This method is useful when the
data are reviewed at the expiry of a period of years ; but the 
former should be preferred if the data are to be made of a 
continuous character. If an arbitrary year is chosen as the 
base, it may happen to be an abnormal year, for instance, a 
year of labour unrest, of war or of financial crisis. Therefore, 
in selecting a base year the fact whether statistics of that year 
are reasonably normal should be specially considered. If 
an unusual year is taken as the base, the index numbers calcu-
lated on it will have to be qualified with a statement drawing
attention to the abnormality of the base year. To avoid all 
this, a base period is often chosen. In averaging the prices 
of a group of years chances of abnormalities being present 
are reduced. Average of a period of years (or rather, of the
whole group of years to which the series of prices relate) is
representative, is less affected by chance variations and is 
most generally applicable in statistics. In India, however, the 
wholesale price indices and most cost of living indices use a 
single year as base year. Only in some cost of living indices 
is the average of a few years taken as the base. 
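
To illustrate a base period as distinct from a single base year, the sketch below takes the rice prices of Table 27 and uses the average of the whole period 1928 to 1934 as the base. It is an illustration only; the variable names are our own, and each year's price is simply expressed as a percentage of the base average.

```python
from statistics import mean

# Rice prices in Rs. per maund, from line 1 of Table 27.
rice_prices = {1928: 7.3, 1929: 7.7, 1930: 5.8, 1931: 4.1,
               1932: 4.3, 1933: 4.1, 1934: 3.7}

# Base: the average price over the entire period.
base_price = mean(rice_prices.values())

# Each year's price as a percentage of the base-period average.
relatives = {year: price / base_price * 100
             for year, price in rice_prices.items()}

print(round(base_price, 2))
for year, value in relatives.items():
    print(year, round(value))
```

With the whole period as base, the relatives themselves average 100, so no single year's abnormality governs the level of the series.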

The average price of the base chosen is taken as 100, and
the price in each of the other years is expressed as a percent-
age of this amount. Thus,

$$\frac{\text{price of a commodity for the current year}}{\text{price of the commodity for the base year}} \times 100$$

will give the percentage (or price relative) for the current
year. The percentage price of rice in 1930 on the basis of
1928 is $\frac{\text{Rs. } 5.8}{\text{Rs. } 7.3} \times 100 = 79$. This price relative is the index
number for rice for 1930 with 1928 as the base. All the rela-
tives in table 28, from line 1 to 6, have been computed in this
manner with 1928 as the fixed base year.
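
The computation of fixed-base relatives can be sketched for the rice series of Table 27. The function name `price_relative` is our own; the results are rounded to whole numbers, as relatives are shown in the tables of this chapter.

```python
# Rice prices in Rs. per maund, from line 1 of Table 27.
rice_prices = {1928: 7.3, 1929: 7.7, 1930: 5.8, 1931: 4.1,
               1932: 4.3, 1933: 4.1, 1934: 3.7}

BASE_YEAR = 1928


def price_relative(current_price, base_price):
    """Current price expressed as a percentage of the base price."""
    return current_price / base_price * 100


fixed_base_relatives = {
    year: round(price_relative(price, rice_prices[BASE_YEAR]))
    for year, price in rice_prices.items()
}

print(fixed_base_relatives)
```

The value for 1930 reproduces the relative of 79 worked out in the text.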





Table 28. Fixed-Base Relative Index Numbers of Wholesale
Prices of Certain Commodities in Cawnpore,
1928-1934. (1928=100)

                                     Percentages or Relatives, 1928=100
Line  Commodities                  1928   1929   1930   1931   1932   1933   1934

 1    Rice                          100    105     79     56     59     56     51
 2    Wheat                         100     71     47     38     44     42     36
 3    Linseed                       100    114     93     60     50     49     51
 4    Gur                           100    112     95     65     54     48     63
 5    Cotton                        100     87     51     39     43     38     39
 6    Tobacco                       100    100     84     67     28     28     33
 7    Total of Relatives            600    589    449    325    278    261    273
 8    Average of Relatives          100     98     75     54     46     44     45
 9    Median of Relatives           100    102     82     58     47     45     45
10    Geometric mean of Relatives   100     97     72     53     45     42     44


Chain Base Method.

In the fixed base method the base is fixed in the 
sense that the relatives for all the years are based on 
the prices of a single year (1928 in our example) or 
of an average of a period of years. Contrasted with it is the 
chain base or the shifting base method in which the relatives 
for each year are calculated upon the prices of the preceding 
year, and the results are chained together afterwards. Thus, 
the base year is not fixed, but changes from year to year. 
According to this method, in our example, we would express 
the 1929 figures as percentages of those for 1928, and get index 
numbers for the commodities for 1929 on 1928 as base; then, 
for 1930 we would express the 1930 figures as percentages of
those for 1929 and obtain index numbers (price relatives) for
1930 on 1929 as base; and so on. Thus the percentage (or link
relative as it is called in the case of chain base method) for
rice for 1929 is $\frac{\text{Rs. } 7.7}{\text{Rs. } 7.3} \times 100 = 105$, for 1930 is $\frac{\text{Rs. } 5.8}{\text{Rs. } 7.7} \times 100 = 75$,
and so on. The link relatives in table 29 from line 1 to 6 are
based on the preceding year, that is, the years are linked
together.
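
Link relatives differ from fixed-base relatives only in the choice of divisor: each year's price is divided by the price of the year before. A minimal sketch for the rice series of Table 27 follows; the variable names are our own.

```python
# Rice prices in Rs. per maund, from line 1 of Table 27.
rice_prices = {1928: 7.3, 1929: 7.7, 1930: 5.8, 1931: 4.1,
               1932: 4.3, 1933: 4.1, 1934: 3.7}

years = sorted(rice_prices)

# Each year's price as a percentage of the preceding year's price.
link_relatives = {
    year: round(rice_prices[year] / rice_prices[previous] * 100)
    for previous, year in zip(years, years[1:])
}

print(link_relatives)
```

The first year of the series has no preceding year, and is therefore simply entered as 100 in the table.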

Table 29. Chain-Relative Index Numbers of Wholesale Prices
of Certain Commodities in Cawnpore, 1928-1934.
(1928=100).

                                 Percentages or Relatives Based on Preceding Year
Line  Commodities                  1928   1929   1930   1931   1932   1933   1934

 1    Rice                          100    105     75     71    105     95     90
 2    Wheat                         100     71     65     75    126     94     88
 3    Linseed                       100    114     81     65     83     97    106
 4    Gur                           100    112     85     68     83     88    132
 5    Cotton                        100     87     58     77    111     87    102
 6    Tobacco                       100    100     85     80     42    100    116
 7    Total of Link Relatives       600    589    449    436    550    561    634
 8    Average Link Relatives        100     98     75     73     92     94    106
 9    Chain-Relatives (1928=100)    100     98     74     54     49     46     49

Type of Average to be used.

The relatives arrived at by the fixed base method or the 
chain base method should be averaged to yield the required 
final index number. In theory, any form of average can be 
used for the purpose. In practice, however, we are to choose 
among (a) arithmetic average, (b) median and (c) geometric 
mean. Lines 8, 9 and 10 of table 28 give, respectively, the
arithmetic average, median and geometric mean of price
relatives computed according to fixed base method. These 
averages are the final index numbers of wholesale prices at 
Cawnpore for different years on 1928 as the base. 
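
The three averages may be compared on a single column of Table 28. The sketch below takes the six relatives for 1930 and computes each average in turn. The geometric mean is found through logarithms, in the manner of hand computation; the helper `geometric_mean` is our own and is not taken from any library.

```python
import math
from statistics import mean, median

# Relatives for 1930 (1928 = 100), lines 1 to 6 of Table 28.
relatives_1930 = [79, 47, 93, 95, 51, 84]


def geometric_mean(values):
    """Antilogarithm of the average of the logarithms of the items."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


arithmetic = mean(relatives_1930)
middle = median(relatives_1930)
geometric = geometric_mean(relatives_1930)

print(round(arithmetic, 1))
print(middle)
print(round(geometric, 1))
```

With an even number of items the median falls midway between the two central relatives, and the geometric mean comes out below the arithmetic average, as it always does unless all the items are equal.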

Arithmetic Mean of Relatives in fixed base method.

The arithmetic average has the advantage of being readily 
intelligible, but suffers from a bias which it is not easy to re- 
move. It is too much affected by the extremes; it gives too 
much weight to increasing prices and too little to decreasing 
ones. The arithmetic average of relatives, as we shall just 
see, is not reversible. For all these reasons, the arithmetic 
average does not reflect the typical movement of prices.

Median of Relatives in fixed base method.

Median is the easiest to calculate, and enjoys an advantage 
over the arithmetic average in that it is but little affected by 
extreme items. Median is, therefore, very likely to be more 
typical of price movements than the mean. But it may not 
be possible to find an actual median, e.g., in table 28 the last 
six medians had to be interpolated. Besides, median may be 
erratic when the number of items is small. Again, the median 
of relatives is not reversible. Therefore, median too is not a 
suitable form of average. 

Geometric Mean of Relatives in fixed base method. 

The geometric mean is of value when items in a group are 
considered from the viewpoint of their relative differences 
rather than that of absolute differences. Therefore, it is reason- 
able to use it in computing index numbers where the items to 
be averaged are themselves relatives. It is indeed suitable for
measuring the average ratio of change in prices for it gives
equal importance to equal ratios of change. For instance,
when the geometric mean of relatives is taken, the effect of
doubling of one price is perfectly counterbalanced by the
halving of another. This is not the case with arithmetic
average or median. Similarly, if the price of one commodity
rises by 50% and that of another falls by 50%, the arithmetic
average of relatives will neither rise nor fall implying that 
there is no change in the price level, while, in fact, both the 
prices show a change. The geometric mean of the rela-
tives would, in this case, show that there is a change in
price. Table 30 illustrates these two ideas.

Table 30. Fixed Base Index Numbers of X and Y Commodities
(1941=100).

                                   1941 (base year)         1942                1943
Commodities                        Price    Relative    Price    Relative   Price    Relative
                                    Rs.                  Rs.                 Rs.

X                                    5        100        10        200       7.5       150
Y                                    4        100         2         50        2         50

Total of Relatives                            200                  250                 200
Geometric Mean of Relatives                   100                  100                  87
Arithmetic Average of Relatives               100                  125                 100


The price of X commodity in 1942 is double that of 1941,
and that of Y commodity is half that of 1941. The arith-
metic average index number for 1942 on 1941 is 125, implying a
25% rise in the general price level; but the geometric mean index
number corrects this impression by showing that there is no
change in the level of prices in 1942 as compared with 1941.
Again, in 1943 the price of X commodity has risen 50% over that
in 1941 and that of Y has fallen 50% below that in 1941. The
arithmetic average index number for 1943 on the base 1941
is 100, implying that there is no change in the level of
prices, but the geometric mean corrects this impression by
showing that the index number in 1943 falls to 86.6 or 87.
From these examples it will be clear that the geometric mean,
through its property of giving more weight to small items and
less to large ones, creates the effect of reducing the influence
of upward movements in prices and increasing that of down-
ward movements. This property is of great value in tracing
the course of prices. Geometric mean has the further advant- 
age that it makes possible the replacement of commodities 
which have ceased to be representative by those which have be- 
come representative without affecting the balance of the index. 
Yet another advantage of this mean is that index number 
calculated by using it is reversible, that is, a change of base 
year can be made without affecting the proportionate change 
in the general index. Geometric mean is, therefore, likely 
to be more typical of the changes in prices than are the arith- 
metic mean and the median. Its use in index number con- 
struction is growing, although arithmetic average has so far 
been largely used. 
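
The figures of Table 30 can be verified with a few lines of Python. This is an illustrative sketch only; the helper `geometric_mean` is our own and works through logarithms.

```python
import math
from statistics import mean


def geometric_mean(values):
    """Antilogarithm of the average of the logarithms of the items."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Relatives of X and Y (1941 = 100), from Table 30.
relatives = {1942: [200, 50], 1943: [150, 50]}

arithmetic = {year: mean(values) for year, values in relatives.items()}
geometric = {year: geometric_mean(values) for year, values in relatives.items()}

for year in relatives:
    print(year, arithmetic[year], round(geometric[year], 1))
```

A doubling and a halving cancel exactly under the geometric mean, while a 50% rise set against a 50% fall leaves it below 100, since a fall of one half is undone only by a doubling.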

Chain Relatives. 

In table 29 link relatives from line 1 to 6 have 
been calculated on the chain base method and totalled 
up in line 7. In line 8, average link relatives have 
been computed by dividing the totals in line 7 by 6, the 
number of commodities. These average link relatives have 
been placed in a chain in line 9 by using the arithmetic 
average. These chain relatives are the index numbers for
different years on the chain base method in respect to the 
year 1928. The process of chaining together the link relatives 
is as follows: 

Average link relative for 1929 referred to 1928 is 98,
Average link relative for 1930 referred to 1929 is 75,
Average link relative for 1931 referred to 1930 is 73.

Then $\frac{98}{100} \times 75$ will give a chain relative index for 1930 on 1928,
and $\frac{98}{100} \times \frac{75}{100} \times 73$ will give a chain relative index for 1931 on 1928.

Further chaining of link relatives has been done in similar
manner.
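
The chaining process can be sketched as a running product. The sketch below starts from the average link relatives of line 8 of table 29 and carries the unrounded product forward from year to year, rounding only for display; this is our reading of the procedure, and it reproduces line 9 of the table.

```python
# Average link relatives, from line 8 of Table 29.
average_links = {1929: 98, 1930: 75, 1931: 73, 1932: 92, 1933: 94, 1934: 106}

# The chain starts from the base year, 1928 = 100.
chain = {1928: 100.0}
previous_year = 1928
for year, link in average_links.items():
    # Multiply the preceding chain relative by this year's link relative.
    chain[year] = chain[previous_year] * link / 100
    previous_year = year

for year, value in chain.items():
    print(year, round(value, 1))
```

Rounding at every step instead of only at the end can shift a later figure by a unit, so the unrounded product is the safer thing to carry forward.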

The chain base method has two advantages. Firstly, it
enables a direct comparison between one year and the year 
succeeding it. This is far more useful in business and com- 
merce than the indirect comparison through a remote fixed 
base. Secondly, it makes possible the dropping of old items 
and inclusion of new ones, a necessity not infrequently felt 
when computing a series of index numbers over a period of 
time because of some commodities going out of use and new 
ones coming into fashion. 

Reversibility of Index Numbers. 

An important property, which an index number should 
possess, is its reversibility. Reversibility means that the index 
for the current year based upon the base year and the index 
for the base year based upon the current year should be 
reciprocal to each other. That is, the following equation should 
be satisfied: 

$$P_{01} \times P_{10} = 1$$

where $P_{01}$ stands for index for the current year on the base 
year omitting the factor 100 (i.e., for price change in current 
year compared with base year), and $P_{10}$ stands for index for 
the base year on the current year without the factor 100 (i.e., 
for price change in base year compared with current year). 

The arithmetic average of relatives is not reversible. 
Line 3 of table 31 gives in column (d) the arithmetic average of 
relatives for commodities A and B for 1931 on 1930 as the 
base, and in column (e) the arithmetic average of relatives 
for 1930 based on 1931. These are, respectively, 130.5 and 
78.35, so that $P_{01} = \frac{130.5}{100} = 1.305$, and $P_{10} = \frac{78.35}{100} = 0.7835$. 

Now, $1.305 \times 0.7835 = 1.02$, which is greater than 1. Therefore, 
the arithmetic average of relatives is not reversible. 





Table 31. Testing the Reversibility of Index Numbers. 

Line  Commodities          Price in 1930  Price in 1931  (Year 1931 / Year 1930) × 100  (Year 1930 / Year 1931) × 100
      (a)                  (b)            (c)            (d)                            (e)
1     A                    Rs. 10         Rs. 15         150                            66.7
2     B                    45             50             111                            90
3     Arithmetic average                                 130.5                          78.35
4     Geometric mean                                     129                            77.5

The geometric mean of relatives is reversible. Line 4 of 
table 31 gives in columns (d) and (e) the geometric average 
of relatives. These are 129 and 77.5 approximately, so that 
$P_{01} = \frac{129}{100} = 1.29$, and $P_{10} = \frac{77.5}{100} = 0.775$. Their product is 
1 (allowing for the adjustment of decimals). Therefore, the 
geometric mean of relatives is reversible. 
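
The two tests above can be repeated mechanically. The following Python sketch uses only the prices of commodities A and B from table 31; the helper names (`relatives`, `arithmetic_mean`, `geometric_mean`) are hypothetical, and the relatives are carried unrounded, so the arithmetic product differs slightly in the third decimal from the figure obtained with the rounded relative 111.

```python
from math import prod

# Prices of commodities A and B from table 31.
prices_1930 = {"A": 10, "B": 45}
prices_1931 = {"A": 15, "B": 50}


def relatives(current, base):
    """Price relatives of `current` on `base`, without the factor 100."""
    return [current[name] / base[name] for name in base]


def arithmetic_mean(values):
    return sum(values) / len(values)


def geometric_mean(values):
    return prod(values) ** (1 / len(values))


forward = relatives(prices_1931, prices_1930)   # 1931 on 1930
backward = relatives(prices_1930, prices_1931)  # 1930 on 1931

# Reversibility requires the forward and backward indices to multiply to 1.
am_product = arithmetic_mean(forward) * arithmetic_mean(backward)
gm_product = geometric_mean(forward) * geometric_mean(backward)

print("arithmetic:", round(am_product, 3))
print("geometric: ", round(gm_product, 3))
```

The arithmetic product exceeds 1, while the geometric product equals 1 apart from floating-point error.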

There is yet another way of looking at the reversibility of 
index numbers. If a relative shows an average increase of, 
say, 25 per cent from the base year to the current year, then 
this should also be capable of being described as a decrease 
of 20 per cent from the current year to the base year. In 
table 31, using the arithmetic average we find that the level 
of prices in 1931 is 30.5 per cent higher over the prices of 
1930. We might, therefore, say that the prices of 1930 are 
lower than those of 1931 by 23.4 per cent of the latter; that 
is, if the index for 1931 on 1930 is 130.5, it should be 
(100— 23.4) =76.6 for 1930 on 1931, but actually it is 78.35 
as shown in column (e), line 3. Using the geometric mean we 
find that the level of prices in 1931 is 29 per cent higher over 
the prices of 1930. We might, therefore, say that the prices 





of 1930 are lower than those of 1931 by 22.5 per cent of the 
latter; that is, if the index for 1931 on 1930 is 129, it should 
be (100—22.5) =77.5 for 1930 on 1931, which actually is the 
case as shown in column (e), line 4, table 31. 
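
The result is not peculiar to the figures of table 31. As a short sketch, write $r_i = p_{1i}/p_{0i}$ for the price relative of the $i$-th of $n$ commodities (the symbols $r_i$ and $n$ are introduced here for illustration and are not used elsewhere in the text). For the geometric mean the backward index is exactly the reciprocal of the forward index, while for the arithmetic average the product can never fall below 1:

```latex
% Geometric mean of relatives: forward and backward indices are reciprocals.
P_{01} = \Bigl(\prod_{i=1}^{n} r_i\Bigr)^{1/n},
\qquad
P_{10} = \Bigl(\prod_{i=1}^{n} \frac{1}{r_i}\Bigr)^{1/n} = \frac{1}{P_{01}},
\qquad\text{so that}\qquad
P_{01} \times P_{10} = 1 .

% Arithmetic average of relatives: by the inequality between the
% arithmetic and harmonic means,
\Bigl(\frac{1}{n}\sum_{i=1}^{n} r_i\Bigr)
\Bigl(\frac{1}{n}\sum_{i=1}^{n} \frac{1}{r_i}\Bigr) \;\ge\; 1 ,
% with equality only when all the relatives r_i are equal.
```

The arithmetic average, therefore, fails the test whenever the prices of the commodities have not all changed in the same proportion.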

It is clear, then, that geometric mean stands this test of 
efficiency and can be said to perform satisfactorily its function 
of showing the required change in the phenomenon under 
study. Consequently, geometric mean is more suitable than 
the arithmetic mean or the median. Geometric mean can also 
be used with the chain base method. It is used by the Board 
of Trade in England in the construction of wholesale price 
indices on the chain base principle. We have used the arithmetic 
average in table 29. It is interesting to note that the 
geometric mean is used in the construction of ‘Index numbers 
of wholesale prices of certain articles in India’ and of the 
‘Capital’ Index of Indian Industrial Activity. 

Base Shifting. 

It follows from the above that index numbers based 
on geometric mean of relatives can be shifted from 
base to base without error by what may be called the 
‘short-cut’ method, illustrated in the above example. But 
it is not possible to shift the base without error by using the 
same method when arithmetic average has been used in 
averaging the relatives. This ‘short-cut’ method is, of 
course, not possible to apply when median is used in averaging 
the relatives. 

In addition to this short-cut method another method for 
base shifting is also available, viz., re-computing the relatives 
of each individual item on the new base and averaging them 
afresh, that is, reconstructing the entire series. This method 
should be used for shifting the base when arithmetic average 
or median has been employed in averaging the relatives of 
a series, while both this and the short-cut method will 
yield identical results when geometric mean has been used. 
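
The difference between the two methods of base shifting may be illustrated with the prices of table 31. In the sketch below the short-cut is taken to mean dividing 100 by the old index and multiplying by 100; the function names are hypothetical and the relatives are carried unrounded.

```python
from math import prod

# Prices of commodities A and B from table 31.
prices_1930 = {"A": 10, "B": 45}
prices_1931 = {"A": 15, "B": 50}


def arithmetic_mean(values):
    return sum(values) / len(values)


def geometric_mean(values):
    return prod(values) ** (1 / len(values))


def index(current, base, average):
    """Average of percentage relatives of `current` on `base`."""
    return average([100 * current[name] / base[name] for name in base])


def shift_base(average):
    """Return (short-cut, recomputed) index for 1930 with 1931 as base."""
    on_1930 = index(prices_1931, prices_1930, average)     # 1931, 1930 = 100
    short_cut = 100 / on_1930 * 100                        # simple division
    recomputed = index(prices_1930, prices_1931, average)  # fresh relatives
    return short_cut, recomputed


for name, average in (("arithmetic", arithmetic_mean),
                      ("geometric", geometric_mean)):
    short_cut, recomputed = shift_base(average)
    print(name, round(short_cut, 1), round(recomputed, 1))
```

With the arithmetic average the short-cut gives roughly 76.6 against a recomputed 78.3, whereas with the geometric mean both routes give roughly 77.5.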





The System of Weighting. 

The ‘unweighted’ index is arbitrarily weighted. So far, 
in the construction of index numbers, we have used simple 
averages and no special assumption has been made concerning 
weights. Distinction is very often made between weighted and 
unweighted index numbers, but it should be noted that every 
index number is weighted in some form. In computing the 
simple average of relatives each relative is counted once. 
Therefore, apparently, weights are unity in each case. A 
further thought would reveal that the change in the price 
of a commodity from one date to another is related to the 
commodity’s price level on the first date. If in the base year 
the price of a commodity is unusually high, it will have a 
corresponding influence on the total, that is, it would have 
the same effect as actual weighting would. This can be 
easily verified by recalculating a given series of index numbers 
upon a few different bases by using the arithmetic average of 
relatives and then noticing that every fresh series differs, not 
only in the absolute values of the index numbers, which is 
immaterial, but also in the relative values of the indices, which 
is significant. That this would be so is evident from the fact 
that index numbers where simple arithmetic average is used 
are not reversible. We may, therefore, conclude that even 
the so-called unweighted index numbers are arbitrarily or 
haphazardly weighted, the arbitrary element being exercised 
by the choice or shifting of base year. We may also say that 
when simple average of relatives is used change in the base 
year is equivalent to change in weights. 
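
The verification suggested above can be tried on a small series. In the following sketch the prices for the first two years are those of table 31, while the third-year prices are hypothetical figures added only so that a relative change between two later years can be compared under two different bases.

```python
# Prices for three successive years. The first two columns are taken from
# table 31; the third column is hypothetical, for illustration only.
prices = {
    "A": [10, 15, 30],
    "B": [45, 50, 40],
}


def simple_average_series(prices, base_year):
    """Unweighted arithmetic average of relatives for every year."""
    n_years = len(next(iter(prices.values())))
    series = []
    for year in range(n_years):
        rel = [100 * p[year] / p[base_year] for p in prices.values()]
        series.append(sum(rel) / len(rel))
    return series


on_first = simple_average_series(prices, base_year=0)
on_second = simple_average_series(prices, base_year=1)

# Relative change from the second year to the third, under each base.
ratio_first_base = on_first[2] / on_first[1]
ratio_second_base = on_second[2] / on_second[1]

print([round(x, 1) for x in on_first], round(ratio_first_base, 3))
print([round(x, 1) for x in on_second], round(ratio_second_base, 3))
```

The change from the second year to the third is shown as a rise of about 49 per cent on the first base but of 40 per cent on the second, so the choice of base has acted as a concealed system of weights.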

Implicit and Explicit Weighting. 

In those index numbers which are termed ‘weighted’ 
the weights are chosen according to some systematic plan. 
Weights may be implicit or explicit. Implicit weights relate 
to the selection of commodities themselves. If a particular 
commodity, or a commodity of the same general class, is included, 
say, 3 times in the list of prices, the weight given to 
the commodity is 3. For instance, 3 varieties of sugar may 
be included. Varying emphasis is, thus, given to the different 
items while selecting the commodities by the number of times 
a given commodity is included in the selection. Many of the 
so-called unweighted index numbers may, in fact, be indirectly 
or implicitly weighted. For instance, the Calcutta and the 
Bombay Wholesale Price Index Numbers are implicitly 
weighted. 

In assigning explicit weights, weights proportional to the 
relative importance of different items are used. But, what 
considerations determine this relative importance? This enquiry 
is essential because weights should either be appropriate 
or they should not be used at all. Now, in constructing an 
index to show general changes in prices, the weights assignable 
to wholesale prices may be several, for instance, the quantity 
of goods placed on the market, value of goods produced, values 
consumed, and so on. Different systems of weighting would 
yield different results. The difficulty, then, is: which of these 
or other similar criteria should be accepted as correct? This 
difficulty is not easily soluble. Therefore, it appears that 
weights may better be ignored. This idea is strengthened by 
the fact that weighted results are almost identical with the 
unweighted ones, if weights are chosen according to chance. 

Nevertheless, the problem of selecting weights is one of 
practical concern. The merits of weighted and unweighted 
indices can be understood only by comparing them. If a properly 
weighted index agrees with an unweighted one, weights 
may be dispensed with; if it does not, weights ought to be 
used. According to Bowley*, paucity of data might make the 
inclusion of weights necessary and the popular desire for concrete 
measurements might make a fine show of weighting expedient. 
Weighting seems necessary also because of the 
heterogeneous character of the series from which indices are 
computed. Most wholesale price indices in the U.S.A. are 

*See Bowley, A.L., Elements of Statistics, 1920 ed., p. 206. 





weighted. Weighting is, indeed, essential in constructing 
cost of living and business activity indices, as we shall see 
later. 

Methods of Weighting. 

Two methods of explicit weighting may be distinguished : 
The Weighted Average of Relatives (Ratios) Method and the 
Weighted Aggregate of Actual Prices Method. The latter is 
known as the Aggregative Method also. 

Weighted Average of Relatives. — According to this method 
price relatives are weighted by values. Values are obtained 
by multiplying quantities with their respective prices. The 
sum of the products of price relatives of the current year and 
values of the base year divided by the sum of the weights 
gives the weighted arithmetic average of relatives, which is 
the required index number for the current year. Symbolically, 

$$\text{Index Number for Current Year} = \frac{\sum IV}{\sum V}$$

where $I$ stands for price relative and $V$ for value (weight). 
Table 33 demonstrates the working of this method. Weighted 
median and weighted geometric mean of the relatives may 
also be computed. 

Aggregative Method. — According to this method prices 
themselves are weighted by quantities, since total value is equal 
to price × quantity. The products of actual prices of the 
current year and quantities of the base year are summated. 
This sum is expressed as ratio in relation to a given base. This 
ratio is the index number for the current year. Symbolically, 

$$\text{Index Number for Current Year} = \frac{\sum p_1 q_0}{\sum p_0 q_0} \times 100$$

where $p_1$ stands for price in current year, 
$p_0$ stands for price in base year, 
$q_0$ stands for quantity in base year. 

Table 32 demonstrates the working of this method. 

In the above case the weights are fixed. If quantities for 
all the years for which it is desired to calculate the index 
numbers are available, the weights may be made to vary from 
year to year, quantities for different years being used as 
weights for their respective years. Several formulae have 
been suggested for this purpose. We shall, however, confine 
ourselves to the Crossed Weight Formula given by Fisher, 
which is supposed to be highly satisfactory. 

Fisher’s “Ideal” Formula. 

Professor Irving Fisher,* after an elaborate examination 
of 134 possible formulae, concluded that a scheme of cross 
weighting should be used, and gave a Crossed Weight Formula, 
which is also named as Fisher’s “Ideal” Formula. It is: 

$$P_{01} = \sqrt{\frac{\sum p_1 q_0}{\sum p_0 q_0} \times \frac{\sum p_1 q_1}{\sum p_0 q_1}}$$

This formula requires four sets of aggregates, viz., 

(1) $\sum p_1 q_0$: Current year price × base year quantity, 

(2) $\sum p_1 q_1$: Current year price × current year quantity, 

(3) $\sum p_0 q_0$: Base year price × base year quantity, 

(4) $\sum p_0 q_1$: Base year price × current year quantity. 

The first aggregate is divided by the third, and the second 
by the fourth. The two resulting relatives are multiplied 
together and square root of the product is extracted. Fisher 
calls this formula “ideal”, since it neutralizes the types 
of bias which are found in measuring prices and quantities. 
The system of weighting has been so designed in the formula 
that the resulting index satisfies two basic tests, viz., Time 
Reversal Test and Factor Reversal Test. 

Time Reversal Test. It has already been indicated in connection 
with the “Reversibility of Index Numbers”† what time 
reversal test implies. According to Fisher this test may be 
described as follows: 

“The test is that the formula for calculating an index 
number should be such that it will give the same ratio between 


*See Fisher, Irving, Making of Index Numbers, 1922. 
†See page 217. 




one point of comparison and the other, no matter which of the 
two is taken as base. 

“Or, putting it another way, the index number reckoned 
forward should be the reciprocal of that reckoned backward.”* 

This implies that the following equation should be satisfied: 

$$P_{01} \times P_{10} = 1$$

This, again, means that if an index shows that between 1938 
and 1942 prices doubled, then it should also show that the 
level of prices in 1938 was one-half of that in 1942 when 
measured from 1942. 

Factor Reversal Test. A second fundamental test by 
means of which good index numbers can be detected is the 
factor reversal test. Regarding this test Fisher says: 

“ Just as our formula should permit the interchange of 
the two times without giving inconsistent results, so it ought 
to permit interchanging the prices and quantities without 
giving inconsistent results — i.e., the two results multiplied 
together should give the true value ratio.”† 

This implies that the following equation should hold good: 

$$P_{01} \times Q_{01} = \frac{\sum p_1 q_1}{\sum p_0 q_0}$$

where $P_{01}$ stands for the price change for the current year 
on the base year, $Q_{01}$ for the quantity change for the current 
year on the base year, $\sum p_1 q_1$ for the total value (price × quantity) 
in the current year, $\sum p_0 q_0$ for the total value in the base 
year, and $\frac{\sum p_1 q_1}{\sum p_0 q_0}$ for the ratio of the total value in the current 
year over the total value in the base year. 

Fisher’s “Ideal” formula not only satisfies both the above 
tests, but is also simple and easy to calculate from the practical 
point of view. Therefore, of the 134 possible formulae 
which Fisher analyzed, the “Ideal” is ideal. But this formula 
requires statistics of quantities for the base year as well as 
the current year. These statistics are generally not available 


*Fisher, Irving, Op. Cit., p. 64. 

†Fisher, Irving, Op. Cit., p. 72. 





for every current year. They may be available at each successive 
census of production, if such censuses are taken in a 
country. Therefore, the choice has to lie with the use of fixed 
weights, quantities of the base year or the year supposed 
to be typical. 
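
Where quantities for both years are to hand, the crossed weight formula and its two tests can be checked numerically. The sketch below is an illustration only: the prices and quantities are hypothetical, the function names are not Fisher's, and the index is returned without the factor 100 so that the two tests can be applied directly.

```python
from math import isclose, sqrt


def aggregate(prices, quantities):
    """Sum of price × quantity over all commodities."""
    return sum(p * q for p, q in zip(prices, quantities))


def fisher_index(p0, q0, p1, q1):
    """Crossed weight index of year 1 on year 0, without the factor 100."""
    base_weighted = aggregate(p1, q0) / aggregate(p0, q0)     # (1) / (3)
    current_weighted = aggregate(p1, q1) / aggregate(p0, q1)  # (2) / (4)
    return sqrt(base_weighted * current_weighted)


# Hypothetical prices and quantities for three commodities.
p0, q0 = [6, 4, 5], [5, 5, 1]
p1, q1 = [8, 5, 10], [4, 6, 1]

price_forward = fisher_index(p0, q0, p1, q1)
price_backward = fisher_index(p1, q1, p0, q0)  # the two years interchanged
quantity_forward = fisher_index(q0, p0, q1, p1)  # prices and quantities interchanged
value_ratio = aggregate(p1, q1) / aggregate(p0, q0)

# Time reversal test: forward index × backward index = 1.
print(isclose(price_forward * price_backward, 1.0))
# Factor reversal test: price index × quantity index = value ratio.
print(isclose(price_forward * quantity_forward, value_ratio))
```

Both checks print `True` for any positive data, since the cross products cancel algebraically; the hypothetical figures serve only to make the arithmetic concrete.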

Summary and General Remarks. 

The technique of construction of price index numbers may 
be summarised as follows: 

(1) Select a reasonable number of representative commodities. 

(2) Arrange for obtaining their regular wholesale prices from 

    (i) either, standard trade journals, 

    (ii) or, leading dealers of representative centres. 

(3) Average the price quotations, and obtain monthly 
    or yearly average prices as the case may be. 

(4) Reduce the average prices to percentages, i.e., price 
    relatives, on 

    (i) either, the fixed base method, where 

        (a) the fixed base may be a fixed year, or 

        (b) it may be an average of a period of years, 

    (ii) or, the chain base method. 

(5) If the fixed base method is used, compute a simple 
    average of relatives, using the arithmetic 
    average, median or the geometric mean. Theoretically, 
    the advantage lies with the geometric mean. 

    If the chain base method is used, chain together 
    the link relatives. 

(6) If weighting is necessary, compute 

    (i) either, the weighted average of relatives, 

    (ii) or, the weighted aggregate of actual prices. 

Thus, we have discussed two important methods of 
constructing index numbers, viz., the Average of Relatives 
Method and the Weighted Aggregate of Actual Prices Method. 
In the former method, the average may be Simple or Weighted. 

A comparison of the unweighted index numbers calculated 
on the fixed base method in table 28 and of the unweighted 
index numbers calculated on the chain base method in table 
29, and also of the weighted index numbers which can be 
calculated from the same figures would reveal that different 
methods yield different results, but all index numbers, without 
any exception, point in the same direction. Therefore, 
an index may be relied upon so far as the tendency shown by 
it is concerned without being trustworthy to the last digit. 
It is not the absolute value of an index number that matters. 
What matters is the general trend shown by it, or by a series 
of index numbers. 

COST OF LIVING INDEX NUMBERS. 

The methods of weighting discussed above are more 
particularly used in the construction of cost of living index 
numbers. These indices are designed to study the effect of 
changes in prices on the people as consumers, or, in other 
words, to study the average increase in the cost of maintaining 
the standard of living in a given year unchanged from that in 
the base year. General index numbers fail to afford us an 
exact idea regarding the effect of the change in the general 
price level on the cost of living of different classes of people, 
since a given change in the level of prices affects different 
classes of people differently. Therefore, to obtain a measure 
of the general movement of prices of those commodities which 
enter into the consumption of different classes of people, 
Cost of Living Index Numbers are compiled. 

Difficulties in Constructing Cost of Living Indices. 

Standard of living varies with income or occupation. 
Therefore, one single cost of living index will not be truly 
representative of people of different incomes. Consequently, 
index numbers should be compiled separately for different 
classes of people. But standard of living also varies with 
region or place in which people reside. This difficulty can be 
solved by compiling index numbers separately for different 
localities or different homogeneous zones. Again, same classes 
of people at the same time do not spend their income in exactly 
similar proportions on different objects. The best that can 
be done to obviate this difficulty is to collect a reasonable 
number of sufficiently accurate samples of family budgets from 
the same class of people to have a general idea of the proportions 
of expenditure on different objects by an average family. 
And, yet there is another difficulty that the same classes of 
people at different times spend their income in varying proportions. 
A change in the nature and quantity of commodities 
consumed may arise from a change in taste or fashions, 
or from an increased purchase of cheapening commodities and 
decreasing consumption of things becoming dearer. These 
factors, indeed, go a long way in explaining the change in the 
cost of living. But these changes cannot be taken stock of 
every year without incurring the huge expense of conducting 
fresh family budget enquiries. For this reason, it is assumed 
that the qualities and quantities of commodities consumed in 
the base year by a particular class of people remain the same 
for an indefinite period. These qualities and quantities, therefore, 
form the basis of the index number series. Another 
factor that causes a change in the standard of living is the 
change in the purchasing power of money. Cost of living 
index number confines itself to a measure of this factor alone. 
Further, people as consumers pay retail and not wholesale 
prices. Therefore, retail prices are taken into consideration 
in constructing cost of living indices. But retail prices vary 
from locality to locality. If cost of living index numbers are 
computed separately for different classes and different regions, 
this difficulty of variation in retail prices is also got over. 

Construction of Cost of Living Index Numbers. 

The first step, therefore, that is taken for the construction 
of cost of living index number series is to decide the class of 
people — industrial workers, clerks, etc., — for which the index 
numbers are to be compiled. Next, a sample budget enquiry 
of the class concerned is made, the sample covering a reason- 
ably adequate number of families and conducted during a 
period reasonably free from abnormalities of very high or 
very low prices. This budget enquiry would give precise information 
regarding (1) the nature, qualities and quantities 
of commodities consumed by the people classified under the 
heads of food, clothing, rent, lighting and fuel, and miscellane- 
ous groups, (2) the retail prices of the different commodities, 
(3) the proportion that the expenditure on each individual 
item of expenditure bears to the expenditure on the group 
to which it belongs, (4) the proportion which expenditure on 
each group bears to the total expenditure. This budget en- 
quiry forms the basis of the index number series. With it, the 
selection of commodities whose retail prices are to be regularly 
obtained becomes easy. It is important to note that a cost of 
living index number should include only those commodities 
under the food, clothing, etc. groups which are generally used 
by the class of people concerned, which are not subject to 
wide variations in quality nor to seasonal alterations in supply, 
and for which regular and comparable quotations of prices 
are obtainable. Retail price quotations should be obtained 
from the localities in which the class of people concerned reside 
or from which they usually make their purchases. The 
sources of price quotations may be either standard trade 
journals, or publications of government or municipalities, or 
typical businessmen in the locality concerned. From these 
regular price quotations average prices are calculated in the 
same manner as they are done in the case of general index 
numbers. 

To convert these average prices into index numbers the 
average prices or their relatives must, of necessity, be 
weighted, because the average consumer is not recompensed 
for a rise in the price of, say, cotton cloth by an equal fall in 
that of cement. Different objects of his consumption have 
different importance in his budget. They must be assigned 
their relative importance. For this purpose one of the two 
systems of weighting may be applied: (1) The Aggregate 
Expenditure Method and (2) The Family Budget Method. 

Aggregate Expenditure Method. 

This method is the same as the Weighted Aggregate of 
Actual Prices Method already discussed. Table 32 demonstrates 
the calculation of cost of living index number for the 
artisan class in Eastern U.P. by this method. (The figures in 
the table are imaginary.) Quantities consumed in the base 
year have been taken as weights for the current year. The 
quantities consumed in the current year may be, and usually 
are, different from those consumed in the base year, but for the 
reason already indicated, viz., the enormous cost involved in 
conducting a fresh budget enquiry every year, fixed weights, 
i.e., quantities consumed in the base year, have been, and are, 
used as weights. This is also why Fisher’s “ideal” method is 
difficult to follow in practice, for in using it quantities consumed 
in the current year should be known in addition to those 
consumed in the base year. 





Table 32. Construction of Cost of Living Index Number by 
the Aggregate Expenditure Method. 

(1)            (2)            (3)          (4)          (5)          (6)               (7)
Article        Quantities     Unit         Price in     Price in     Aggregate         Aggregate
               consumed in                 base year    current      expenditure in    expenditure in
               base year                   (1925)       year (1941)  base year         current year
               (1925)                                                [cl. 2 × cl. 4]   [cl. 2 × cl. 5]
               q0                          p0           p1           p0 q0             p1 q0
                                           Rs.          Rs.          Rs.               Rs.
Rice           5 mds.         per maund    6            8            30                40
Bajra, Jowar   5 mds.         per maund    4            5            20                25
Wheat          1 md.          per maund    5            10           5                 10
Gram           1 md.          per maund    3            6            3                 6
Arhar          .5 md.         per maund    4            6            2                 3
Other pulses   2 mds.         per maund    3            4            6                 8
Ghee           4 seers        per seer     1.25         2            5                 8
Gur            2 mds.         per maund    2.50         5            5                 10
Salt           12.5 seers     per maund    4            5            1.25              1.56
Oil            24 seers       per maund    20           25           12                15
Clothing       40 yds.        per yard     0.25         0.5          10                20
Firewood       10 mds.        per maund    0.50         0.8          5                 8
Kerosene       1 tin          per tin      4            6            4                 6
House-rent     -              per house    12           15           12                15
Total                                                                Σp0 q0 = 120.25   Σp1 q0 = 175.56

$$\text{Index Number for the Current Year (1941)} = \frac{\sum p_1 q_0}{\sum p_0 q_0} \times 100 = \frac{175.56}{120.25} \times 100 = 146$$


Quantities consumed in any year (supposed to be typical) 
other than 1925 could also have been used as fixed weights. 
Similarly, figures proportional to quantities consumed could 
also have been used in place of the actual figures. 
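
The working of table 32 may be reproduced as a check. The sketch below is illustrative only and rests on two assumptions that are not stated in the table: the quantities of salt and oil, given in seers, are converted to maunds at 40 seers to the maund so as to agree with their per-maund prices, and the quantity for house-rent is taken as one house. The function name is hypothetical.

```python
SEERS_PER_MAUND = 40  # assumed conversion, used for salt and oil only

# (article, base-year quantity in the unit of the price quotation, p0, p1)
budget = [
    ("Rice",         5,                      6,    8),
    ("Bajra, Jowar", 5,                      4,    5),
    ("Wheat",        1,                      5,    10),
    ("Gram",         1,                      3,    6),
    ("Arhar",        0.5,                    4,    6),
    ("Other pulses", 2,                      3,    4),
    ("Ghee",         4,                      1.25, 2),
    ("Gur",          2,                      2.5,  5),
    ("Salt",         12.5 / SEERS_PER_MAUND, 4,    5),
    ("Oil",          24 / SEERS_PER_MAUND,   20,   25),
    ("Clothing",     40,                     0.25, 0.5),
    ("Firewood",     10,                     0.5,  0.8),
    ("Kerosene",     1,                      4,    6),
    ("House-rent",   1,                      12,   15),  # one house assumed
]


def aggregate_expenditure_index(rows, scale=1.0):
    """Sum of p1*q0 divided by sum of p0*q0, times 100.

    `scale` multiplies every quantity, to show that figures merely
    proportional to the quantities consumed lead to the same index.
    """
    base_total = sum(scale * q0 * p0 for _, q0, p0, _ in rows)
    current_total = sum(scale * q0 * p1 for _, q0, _, p1 in rows)
    return 100 * current_total / base_total


index_1941 = aggregate_expenditure_index(budget)
index_scaled = aggregate_expenditure_index(budget, scale=0.2)

print(round(index_1941), round(index_scaled))
```

Under these assumptions both figures round to 146, in agreement with the table.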


Family Budget Method. 

This method is the same as the Weighted Average of 
Relatives Method already discussed. Table 33 demonstrates 
the calculation of cost of living index number for the same 
artisan class of Eastern U.P. by this method using the same 
data. Values consumed in the base year have been used as 
weights for the current year. For the reason already indicated 
values in the current year are not used, and fixed weights are 
employed. 

Table 33. Construction of Cost of Living Index Number by 
the Family Budget Method. 

(1)            (2)          (3)          (4)          (5)               (6)               (7)
Article        Unit         Price in     Price in     Price relatives   Weights (values   Product of price
                            base year    current      for current       consumed in       relatives and
                            (1925)       year (1941)  year              base year)        weights
                                                                                          [cl. 5 × cl. 6]
                            p0           p1           p1/p0 × 100 = I   V                 IV
                            Rs.          Rs.                            Rs.
Rice           per maund    6            8            133.3             30                3999
Bajra, Jowar   per maund    4            5            125               20                2500
Wheat          per maund    5            10           200               5                 1000
Gram           per maund    3            6            200               3                 600
Arhar          per maund    4            6            150               2                 300
Other pulses   per maund    3            4            133.3             6                 799.80
Ghee           per seer     1.25         2            160               5                 800
Gur            per maund    2.50         5            200               5                 1000
Salt           per maund    4            5            125               1.25              156.25
Oil            per maund    20           25           125               12                1500
Clothing       per yard     0.25         0.5          200               10                2000
Firewood       per maund    0.50         0.8          160               5                 800
Kerosene       per tin      4            6            150               4                 600
House-rent     per house    12           15           125               12                1500
Total                                                                   ΣV = 120.25       ΣIV = 17555.05

$$\text{Index Number for the Current Year (1941)} = \frac{\sum IV}{\sum V} = \frac{17555.05}{120.25} = 146$$






Values of any year other than 1925 could also have been 
used as weights. Figures proportional to actual values could 
also be, likewise, used. For instance, instead of using 30, 20, 
5 etc. as weights we could have divided each of them by 5 and 
used 6, 4, 1 etc. as weights. 
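
That proportional weights leave the index unchanged can be confirmed with the figures of table 33. The sketch below is an illustration under stated assumptions: the function name is hypothetical, and the price relatives are computed without rounding, so the sum of products differs slightly from the printed 17555.05, which uses the rounded relative 133.3.

```python
# (article, p0, p1, value consumed in base year) as in table 33.
family_budget = [
    ("Rice",         6,    8,   30),
    ("Bajra, Jowar", 4,    5,   20),
    ("Wheat",        5,    10,  5),
    ("Gram",         3,    6,   3),
    ("Arhar",        4,    6,   2),
    ("Other pulses", 3,    4,   6),
    ("Ghee",         1.25, 2,   5),
    ("Gur",          2.5,  5,   5),
    ("Salt",         4,    5,   1.25),
    ("Oil",          20,   25,  12),
    ("Clothing",     0.25, 0.5, 10),
    ("Firewood",     0.5,  0.8, 5),
    ("Kerosene",     4,    6,   4),
    ("House-rent",   12,   15,  12),
]


def family_budget_index(rows, divisor=1.0):
    """Weighted arithmetic average of price relatives.

    Every weight is divided by `divisor`, so that divisor=5 turns the
    weights 30, 20, 5, ... into 6, 4, 1, ... as described in the text.
    """
    weights = [value / divisor for _, _, _, value in rows]
    rel = [100 * p1 / p0 for _, p0, p1, _ in rows]
    return sum(i * v for i, v in zip(rel, weights)) / sum(weights)


with_values = family_budget_index(family_budget)
with_reduced_weights = family_budget_index(family_budget, divisor=5)

print(round(with_values), round(with_reduced_weights))
```

Both sets of weights give an index which rounds to 146.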

It will be seen that the cost of living index numbers by 
both the aggregative and the family budget methods exactly 
agree. Indeed, they should, if the weights relate to the same 
year. Family Budget method, or Weighted Average of 
Relatives method, is largely in use. 
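
The agreement follows directly from the two formulae. As a short sketch, put $I = \frac{p_1}{p_0} \times 100$ for the price relative and $V = p_0 q_0$ for the value consumed in the base year, as in tables 32 and 33; the base-year price then cancels in every product:

```latex
\frac{\sum IV}{\sum V}
  = \frac{\sum \left(\dfrac{p_1}{p_0} \times 100\right) p_0 q_0}{\sum p_0 q_0}
  = \frac{\sum p_1 q_0}{\sum p_0 q_0} \times 100 .
```

The cancellation holds only when the values used as weights are those of the year whose prices form the base of the relatives, which is why the weights must relate to the same year.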

In table 33, weighted average of all the articles has been 
directly computed. This process can be improved upon by 
(i) dividing the articles into food, clothing, etc. groups, (ii) 
weighting their price relatives by the proportion which expenditure 
on each article bears to the total expenditure on 
the group, (iii) obtaining weighted average index for each 
commodity group, (iv) weighting the group index numbers 
by the proportion which expenditure on each group bears to 
the total expenditure, and (v) obtaining the weighted average 
of the group index numbers. The final weighted average is 
the required cost of living index number. This process of 
double weighting is more scientific than that of computing 
a direct weighted average, and is in general use. In India, 
cost of living index numbers are computed by this method 
as will be seen in the next chapter. 
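
Steps (i) to (v) may be sketched with the articles of table 33. The allotment of articles to the groups below (food, clothing, fuel and lighting, house-rent) is an assumption made for illustration, as are the function names; the prices and base-year values are those of the table.

```python
# (group, article, p0, p1, base-year expenditure in Rs.)
# The grouping is assumed for illustration; the figures are from table 33.
items = [
    ("Food", "Rice",         6,    8,   30),
    ("Food", "Bajra, Jowar", 4,    5,   20),
    ("Food", "Wheat",        5,    10,  5),
    ("Food", "Gram",         3,    6,   3),
    ("Food", "Arhar",        4,    6,   2),
    ("Food", "Other pulses", 3,    4,   6),
    ("Food", "Ghee",         1.25, 2,   5),
    ("Food", "Gur",          2.5,  5,   5),
    ("Food", "Salt",         4,    5,   1.25),
    ("Food", "Oil",          20,   25,  12),
    ("Clothing", "Clothing", 0.25, 0.5, 10),
    ("Fuel and lighting", "Firewood", 0.5, 0.8, 5),
    ("Fuel and lighting", "Kerosene", 4,   6,   4),
    ("House-rent", "House-rent",      12,  15,  12),
]


def group_indices(rows):
    """Steps (i) to (iii): weighted index and expenditure for each group."""
    groups = {}
    for group, _, p0, p1, value in rows:
        groups.setdefault(group, []).append((100 * p1 / p0, value))
    result = {}
    for group, members in groups.items():
        group_total = sum(value for _, value in members)
        # Each relative is weighted by the article's share of group expenditure.
        index = sum(rel * value / group_total for rel, value in members)
        result[group] = (index, group_total)
    return result


def double_weighted_index(rows):
    """Steps (iv) and (v): weight group indices by group expenditure shares."""
    by_group = group_indices(rows)
    total = sum(group_total for _, group_total in by_group.values())
    return sum(index * group_total / total
               for index, group_total in by_group.values())


direct = (sum(100 * p1 / p0 * v for _, _, p0, p1, v in items)
          / sum(v for *_, v in items))
double = double_weighted_index(items)

for group, (index, share) in group_indices(items).items():
    print(group, round(index, 1), share)
print(round(double), round(direct))
```

When the article weights and the group weights come from the same budget, as here, the double weighted index equals the direct weighted average. The practical gain of double weighting is the separate index obtained for each group.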

Errors in Cost of Living Index Numbers. 

The sources of error in cost of living index numbers 
lie in: 

(1) demarcating one class of people from another 
incorrectly. 

(2) the faulty selection of representative articles 
entering into the cost of living of the class of 
people concerned. 

(3) the collection of price quotations. Information relating 
to clothing, for instance, relates, in part, 
to cloths rather than clothes simply because fairly 
steady and reliable prices for the latter are not 
available. 

(4) the faulty assignment of weights. Weights may be 
deliberately manipulated. 

(5) the changes in demand of various articles or their 
prices in the period under investigation. For 
instance, the budget enquiry for the construction 
of cost of living index numbers for industrial 
labour at Cawnpore extended over two decades 
viz., 1914 to 1934 during which period the price 
of firewood changed from annas per maund 
in 1914 to 13 annas in 1919 and to nearly 8 annas 
in 1930. 

Unsatisfactory Character of Cost of Living Index Numbers. 

Even if errors of the types enumerated above are not 
allowed to enter into the construction of cost of living index 
number, the index is not fully dependable. The reason is 
simple. The total monthly expenditure of two families of the 
same class may be equal, but the distribution of the expenditure 
over different objects may considerably vary according to the 
number of persons in the family, their age, sex, religion, caste 
and mode of living, and also according to the rise and fall 
in the prices of articles consumed. Index number does not 
consider the variations in the expenditure of the individuals. 
It considers only average or normal cases. Therefore, an 
index number pointing to the change in the average cost of 
living cannot be applied to every individual case in the class. 
It should be taken only as a guide to the direction, the 
general trend, of the purchasing power of money for the 
class of people for whom it is meant. 





Further, in constructing cost of living index we proceed 
on the assumption that the quantities or the values consumed 
in the base year or the typical year do not change, whereas 
they actually do. Standard of living of the same family undergoes 
change as time elapses and as prices, tastes etc. change. 
This fact is not taken into account by the index. The objection 
may be met by answering that the index number considers 
the increase in the cost of maintaining unchanged the 
base year standard of living. True, but is there a particular 
sanctity about the base year standard of living? The base 
year standard may not be adequate. Improvement in it 
may be necessary. Therefore, to make the index truly representative, 
family budgets should be collected regularly 
after an interval of a few years and new weights adopted, and 
commodities and their qualities and quantities modified in the 
light of every fresh enquiry for subsequent years. 

INDICES OF INDUSTRIAL ACTIVITY. 

If it is desired to study the general change in the industrial
activity in a country over a period of time, evidence of
this change may be found in changes in the output of the
various industries of the country. The first step, therefore,
will be to collect information relating to the production of
different groups of industries. Information may, for instance,
be had for the following groups of industries:

1. Mining — coal, iron ore, petroleum, gold, manganese.

2. Metallurgical — steel work, rolling mills, foundries etc.

3. Mechanical — locomotives, shipbuilding, railway rolling
stock etc.

4. Textile — cotton, woollen, jute, silk etc. 

5. Industries usually subject to excise duties — distilling 
of alcoholic beverages, brewing, sugar, matches and tobacco etc. 





6. Other important industries — chemical, cement, glass-ware,
flour-milling, oil-crushing etc.

As the statistics of production of these industries are
received from year to year, those of the base year are put
down at 100 and those of the subsequent years expressed as
a percentage of the base year. These relatives are multiplied
by weights assigned to them in proportion to the importance
of the industries in the country. The weighted average,
arithmetic or geometric, of the relatives gives the index
number of industrial activity for the country.
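
The procedure just described may be sketched in a few lines of Python. The sketch is only illustrative: the three industries, their outputs and their weights are invented figures, and the function names are our own, not part of any official method.

```python
from math import prod


def production_relatives(base_output, current_output):
    """Express each industry's current output as a percentage of the base year."""
    return {k: 100.0 * current_output[k] / base_output[k] for k in base_output}


def weighted_arithmetic_index(relatives, weights):
    """Weighted arithmetic average of the relatives."""
    total = sum(weights.values())
    return sum(relatives[k] * weights[k] for k in relatives) / total


def weighted_geometric_index(relatives, weights):
    """Weighted geometric average of the relatives."""
    total = sum(weights.values())
    return prod(relatives[k] ** (weights[k] / total) for k in relatives)


# Hypothetical outputs (arbitrary units) and importance weights.
base = {"mining": 200.0, "textile": 500.0, "cement": 50.0}
current = {"mining": 220.0, "textile": 450.0, "cement": 75.0}
weights = {"mining": 3, "textile": 5, "cement": 2}

relatives = production_relatives(base, current)      # 110, 90 and 150
arithmetic = weighted_arithmetic_index(relatives, weights)
geometric = weighted_geometric_index(relatives, weights)
print(round(arithmetic, 1), round(geometric, 1))
```

With these figures the weighted arithmetic average works out to 108, and the geometric average comes out somewhat lower, as it always does when the relatives are unequal.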

INDICES OF BUSINESS CONDITIONS. 

To attempt a study of the changes in the business conditions
of a country, it would be necessary to collect far more
comprehensive data than are required for computing indices
of industrial activity. Professor Pigou selected the following
series for a study of the changes in business conditions of
England:

1. Unemployment percentage.

2. Consumption of pig-iron.

3. Prices in England.

4. Rates of discount on three months’ bills.

5. Volume of manufactured goods.

6. Agricultural production.

7. Yield per acre of nine principal crops. 

8. Index of production from mines. 

9. Clearings of London Clearing Houses. 

10. Increase of bank credit. 

11. Credits outstanding. 

12. Annual increase in the aggregate money wage. 

13. Rate of real wages. 





14. General aggregate consumption.

15. Proportion of reserve to liabilities of the Bank of
England.

These quantities may be converted into relatives referred
to a base year. From these relatives a weighted average may
be obtained. This weighted average shall be the Index
Number of Business Conditions. It will afford a general idea
of the average change in the business conditions of the
country and serve as an Economic Barometer or Forecaster of
changes in business conditions through periods of depression,
recovery, prosperity and crisis. Business conditions are
never stationary. They do change; but the change, it has
been found by experience in industrial countries of the west,
particularly the U.S.A., is not fortuitous. These changes are
also not regular and periodic. Business in general passes
through well-defined major and minor changes. Accordingly,
it is possible to study their order, to measure their present
conditions and to give a forecast of the likely position in the
future. Both in England and in the U.S.A. interest in this
subject is growing.

Uses of Index Numbers. 

Index numbers reflect the movement of some quantity to 
which they relate. Their peculiar character is that they 
exhibit the relative rather than the absolute aspect of such 
movement. Index numbers are not restricted to the price 
phenomenon alone. Any phenomenon which is stretched over 
a period of time and expressed numerically may be presented
through them. They may, for instance, be designed to show 
changes in wages, values of exports and imports, prices of 
securities, production of certain manufactures, circulation of 
notes etc. Different index numbers serve different purposes. 

General price index numbers measure general changes
in prices and through them the value of money. If general
prices, as indicated by the price index number for a certain
year, double, the purchasing power of money for that year
would be halved. Prices can be brought down or controlled
through either the regulation of the supply of money and
credit, or the regulation of production, or both. In any case,
index numbers will provide an apparatus to study the fluctuation
of general prices and a standard for keeping them steady
in the interest of consumer, trade and public finance.

General price index numbers make possible a study of
the movements of prices in different countries and of the fact
whether they are fairly stable.

Cost of living index numbers indicate through their
movement whether real wages are rising or falling, money
wages remaining unchanged. They can be used to grant
bonus to employees to meet the increased cost of living.
Claims of labour for increase in wages, if they turn upon
rising cost of living, can be settled on the basis of cost of living
indices.
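
The connection between money wages, real wages and the cost of living index can be put into a short Python sketch. The figures are hypothetical, and the rule used for the bonus (full compensation for the rise in the index) is only one possible assumption; actual awards may follow other rules.

```python
def real_wage_index(money_wage_index, cost_of_living_index):
    """Real wage index = money wage index deflated by the cost of living index."""
    return 100.0 * money_wage_index / cost_of_living_index


def compensating_bonus(base_wage, cost_of_living_index):
    """Bonus needed to restore the base-period purchasing power of the wage,
    assuming full compensation for the rise in the index."""
    return base_wage * (cost_of_living_index - 100.0) / 100.0


# Money wages unchanged (100) while the cost of living index rises to 125.
unchanged_money_wage = real_wage_index(100.0, 125.0)

# A hypothetical worker earning Rs. 40 per month in the base period.
bonus = compensating_bonus(40.0, 125.0)
print(unchanged_money_wage, bonus)
```

Under these assumptions the real wage falls to 80 per cent of its base level, and a bonus of Rs. 10 would be needed to make good the loss.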

Indices of industrial activity can be utilized to study the
progress of general industrialisation of a country and the
effect of tariff on the development of particular industries.
When an industrial plan is being implemented, such indices
are of immense use in judging the results of the policy
adopted.

Indices of business conditions measure the change in
the general economic activity of a country and afford an
approximate idea of the fluctuations in the real national
income of the country. They can be made to forecast economic
events.

Investment index numbers are of great help to those
interested in the stock market. Indices of imports and exports
give an idea of the fluctuations in the foreign trade of
a country.

It should, however, be noted that index numbers that are
good for one purpose may not be useful for another. For
instance, a general index number indicating a rise in the general
price level is not a good guide to movement in the cost of living,
and cost of living index numbers for industrial workers are
of no use for the upper classes.


EXERCISES.

(1) What is an Index Number? Why is it constructed?

(2) Describe the important problems involved in the preparation
of an Index Number.

(3) What considerations would weigh with you while constructing
a wholesale price index number in connection with the
selection of commodities and the base year?

(4) Give a list of at least 30 representative commodities for
India and of their representative places for obtaining quotations.

(5) Explain the difference between quantity price and money
price. How will you utilize the former in constructing a price
index number?

(6) Distinguish between the Fixed Base and the Chain Base
Methods of constructing index numbers and discuss their relative
merits.

(7) Which average, do you think, is appropriate to use in
averaging the price relatives to arrive at the final index number?
Give your arguments.

(8) What is meant by reversibility of an index number?
Which index numbers are reversible?

(B. Com., Luck., 1930).

(9) ‘Index numbers are economic barometers.’ Explain the
statement, and mention what precautions should be taken in making
use of any published index numbers.

Show, with the help of an example, how you would convert
the index numbers from one basic period to another.

( B. Gom.. Agra, 19K)). 

(10) Explain, with suitable illustrations, the importance of
weighting in the construction of an index number.

(11) What are the different methods of assigning weights to 
price index numbers and cost of living index numbers? Which of 
them is suitable for general use? 





(12) What do you understand by Time Reversal Test and
Factor Reversal Test?

(13) Explain Fisher’s “Ideal” Method of weighting index
numbers and state the difficulties that are to be faced in using it.

(14) Explain the whole process of constructing price index
numbers.

(15) How are the cost of living index numbers calculated?
Explain the different methods used for assigning weights to
different commodities.

(B. Com., Alld., 1933).

(16) It is desired to find the difference in the cost of living
in the years 1939 and 1943 in the case of (i) clerks and (ii)
industrial labourers in a big industrial town.

Explain fully the necessary procedure to be adopted.

(17) What are the main sources of errors in cost of living
index numbers? How can these errors be avoided?

(B. Com., Alld., 1938).

(18) Explain the method of studying changes in the business
conditions of any country during a given period of time.

(19) Explain the meaning of Economic Barometers. How
is this Barometer constructed, and how far is it being used
successfully in forecasting economic events?

(M. A., Alld., 1938).

(20) What are the uses of index numbers? Describe their
limitations.
(21) Explain the use of Index Number with the help of
the following table, which gives the average annual wholesale price
of Jute in Calcutta in rupees per bale of 400 lbs. for the period
1914 to 1930:

Year    Rupees        Year    Rupees
1914      78          1922      88
1915      51          1923      78
1916      67          1924      70
1917      50          1925     112
1918      72          1926      99
1919     102          1927      70
1920      98          1928      75
1921      94          1929      71
                      1930      50

(B. Com., Cal., 1937).





(22) Given the following data, what index numbers would
you use for purposes of comparison? Give reasons.

             Rice               Wheat              Jowar
Year    Price  Quantity    Price  Quantity    Price  Quantity
1927     9.3     100        6.4      11        5.1       5
1934     4.5      90        3.7      10        2.7       3

Prices and quantities are given in arbitrary units.

(M.A., Cal., 1937).

(23) The following table gives the index numbers of wholesale
prices of certain commodities in August 1941 and August 1942
(Base: July, 1914=100). Describe critically how you would compare
the average ratio of prices in August 1941 to those in August
1942.

                          Index Number of Prices
Commodity                August, 1941    August, 1942
Jute, Manufactures           109             107
Jute, Raw                     96              65
Iron and Steel               177             177
Sugar                        145             215
Coal                          80              97
Tea                          225             179


(24) Which average would you use in computing the Price
Index Number from the following data for 1943 on the basis of
1942? Give your reasons.


Commodity 

! Unit 

194.2 ^ 

1943 

Wheat 

per maund 

Its. 

8-8-0 

Rs. 17 0-0 

Ghee 

' per maund 

Its. 

50-0-0 

Rs. 75-0-0 

Firewood 

per maund 

Rs. 

1-0-0 

Rs. OO-S-0 

Sugar 

' per seer 

Rs. 

0-9-0 

Rs. 00-4-C 

Cloth 

1 per yard 

R«. 

0-5-0 

Rs. 00-2-0 


(Figures in the above table are arbitrary) 










(25) What is Chain Base Method? Describe it in connection
with the construction of index numbers from the following data.

Commodity (Index Numbers) 

Year 

A 

B 

C 

D 

E 

193K 

9« 

78 

82 

96 

96 

1939 

100 

82 

78 

100 

100 

1910 

112 

82 

78 

102 

101 

1911 

no 

81 

8i 

98 

98 

1912 

no 

81 

85 

98 

100 

1913 

120 

90 

90 

100 

100 


(26) Following are the group index numbers and the group
weights of an average working class family’s budget. Construct
the cost of living index number by assigning the given weights.

Group                  Index Number for    Weights
                       January 1913
Food                        152               18
Fuel and Lighting           110                6
Clothing                    130                8
House rent                  100               12
Miscellaneous                90               15


(27) The following table gives the average annual prices of
a few commodities in Allahabad for the years 1930, 1931, and 1932.
Calculate the General Index Number for Allahabad for 1931 and
1932 on the basis of the prices in the year 1930, using the
arithmetic average, median and geometric mean. Compare the results
with those obtained by using the chain base method.


C-ommodity 

Unit 

' 

Average 

1930 

Ils. a. p. Rs. 

Annual 

1931 

a. p. 

Price 

1932 

Rs. a. 

P* 

Wheat 

Maund 

5 

8 

0 

5 

0 

0 

1 12 

0 

Rice 

Maund 

7 

1 

0 

7 

0 

0 

6 11 

0 

Arhar 

Maund 

(> 

0 

0 

6 

0 

0 

6 0 

0 

Sugar 

Maund 

13 

0 

0 

13 

1 

0 

12 8 

0 

Salt 

1 Maund 

1 

0 

0 

3 

15 

0 

3 11 


Ghee 

Maund 

60 

0 

0 

58 

12 

0 

58 0 

0 

Oil, Kero.sene 

Tin 

1 

2 

() 

1 

0 

0 

1 2 

0 

Cloth ; 

1 Yard 

0 

10 

0 

0 

8 

0 

0 7 

6 

Fuel 

Maund 

1 

2 

0 

! 1 

0 

0 

0 12 

0 

Milk 

Maund 

5 

5 

6 

1 ® 

0 

0 

5 11 

0 




(28) Construct appropriate Index Numbers, and discuss the
fluctuations in the quantity and value of (a) Raw Cotton, and
(b) Raw Jute exported from India for the period 1930-’31 to
1935-’36, using the average of the period 1926-’27 to 1929-’30 as
base:



Raw 

Cotton 

1 Raw 

Jute 

Year 

Quantity 

Value 

, Quantity ' 

Value 

(Thousand 

Rupees 

(Thousand i 

Ruj)ees 


Tons) 

(Lakhs) 

Tons) ' 

( Lakhs ) 

1926— ’27 to 
1929—30 

609 

5. -til 

826 

2, .921 

(average) 

1930—31 

701 

1 

46,33 

620 

12,S8 

1931—32 1 

423 

23,45 

587 

11,19 

1932—33 

365 

20.37 

563 

9,73 

1933—34 , 

501 1 

27,53 

748 

10,93 

1934—35 ’ 

623 

34,95 

752 

10.87 

1935—36 

607 ! 

33,77 

771 

13.71 


(M.A., Alld., 1942).


(29) Construct the cost of living index number for 1940 on
the basis of 1939 from the following data using the Aggregate
Expenditure Method and the Family Budget Method.


Article 

Quantity 

consumed 

Unit 

Pri<‘e in 
1939 

Price in 
1940 


in 

(1939) 






Rs. 

a. 

Rs. 

a. 

Rice 

. . 6 

maunds 

maund 

5 

12 

6 

0 

Wheat 

. . 6 

maunds 

,, 

5 

0 

8 

0 

Gram 

. . 1 

maund 


6 

0 

9 

0 

Arhar pulse 

. . 6 

maunds 

yy 

8 

0 

10 

0 

Ghee 

. . 4 

seers 

seer 

V 

0 

1 

8 

Sugar 

1 

maund 

maund 

20 

0 

15 

0 

Salt 

. . 12 

seera 

*>? 

20 

8 

18 

0 

Oil 

. . 20 

seers 


4 

0 

4 

1 

Clothing 

. . 50 

yard.s 

yard 

0 

8 

0 

12 

Firewood 

. . 12 

maund.s 

maund 

0 

8 

1 

2 

Kerosene 

1 

tin 

tin 

4 

0 

5 

2 

House-rent 

. , 


house 

10 

12 

12 

12 








(30) The following are 23 price relatives that are available
for the construction of an index number of prices: —

48, 58, 61, 61, 64, 64, 70, 71, 73, 76, 78, 81, 85, 93, 94,
96, 96, 97, 101, 102, 139, 143, and 144.

Regarding these as a statistical group, calculate their mean,
median and a measure of dispersion.

Will you select the mean or the median as the appropriate
average for the index number in question? Give reasons for your
selection.

(M.A., Cal., 1935).

(31) Index numbers seek to set aside the irregularity of
individual instances and replace it by the regularity of the big
numbers. — Comment.



CHAPTER XIII

INDIAN AND FOREIGN INDEX NUMBERS

We have already referred to some of the index numbers
available in India in chapter VII. Here it is proposed to
study some well-known Indian, British and American index
numbers.

INDIAN INDEX NUMBERS 
Current Wholesale Price Index Numbers. 

The following* Index Numbers are being regularly
published in India in the Monthly Survey of Business Conditions
in India : —

1. Calcutta Wholesale Price Index Number.

2. Bombay Wholesale Price Index Number.

3. Madras Wholesale Price Index Number.

4. Cawnpore Wholesale Price Index Number.

5. Index Numbers of Weekly Wholesale Prices of
Certain Articles in India.

Of the above indices the most generally used are the first
two and the last.

* The Karachi Wholesale Price Index Number, based on 23 commodities,
compiled by the Commissioner of Labour, Sind, with July, 1914, as
base has not been available since June, 1942.

Calcutta Index Number. — This index includes 72 items
which are divided into 16 groups. Cereals group includes 8
items, Pulses 6, Sugar 5, Tea 3, Other food articles 9, Oil-seeds
3, Mustard oil 2, Raw Jute 3, Jute manufactures 4, Raw
Cotton 2, Cotton manufactures 7, Other textiles (Wool and
Silk) 2, Hides and skins 3, Metals 6, Other raw and manufactured
articles 8, and Building materials 1. The prices on
which this index number is based are the wholesale prices
prevailing at the end of the month under review in Calcutta,
published before October 1939 in the Indian Trade Journal and
since that date in the Wholesale Prices of Certain Selected Articles
at Various Stations in India. A separate index is computed for
each group. The index for any group is the simple arithmetic
average of the price relatives of the articles comprised in the
group, with July 1914 as the base. Weighting is introduced
within each group by including more than one quotation for
some items within the group. Thus under ‘cereals’ four
varieties of rice are taken, whereas wheat, barley, maize and
oats have only one quotation each. To compute the general
index number, a simple arithmetic average is taken of all the
individual price relatives included in the computation. The
general index number may also be considered as the weighted
average of the group indices, the weight in each case being
equal to the number of items included in the group. The
index is compiled and issued monthly by the Department of
Commercial Intelligence and Statistics, Calcutta. It is
published in the Indian Trade Journal, the Monthly Survey and
the Calcutta journal, the Capital.
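
That the simple average of all the price relatives is the same thing as the average of the group indices weighted by the number of items in each group can be verified with a small Python sketch. The three groups and their relatives below are invented for illustration and are not the actual Calcutta quotations.

```python
from statistics import mean

# Hypothetical price relatives (base = 100), arranged by group.
groups = {
    "cereals": [120.0, 130.0, 110.0, 140.0],
    "tea": [90.0, 100.0],
    "metals": [150.0],
}

# Group index: simple arithmetic average of the relatives in the group.
group_indices = {name: mean(rels) for name, rels in groups.items()}

# General index, first way: simple average of every individual relative.
all_relatives = [r for rels in groups.values() for r in rels]
general_simple = mean(all_relatives)

# General index, second way: group indices weighted by the number of items.
general_weighted = (
    sum(group_indices[g] * len(groups[g]) for g in groups) / len(all_relatives)
)

print(group_indices, general_simple, general_weighted)
```

Both routes give 120 with these figures. A group with many quotations, such as cereals here, therefore carries a correspondingly heavy weight in the general index.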

Bombay Index Number. — This index includes 40 items
which are divided into 11 groups. Cereals group includes 7
items, Pulses 2, Sugar 2, Other food 3. These four groups
constitute ‘All food’ articles. The remaining 7 groups consist of
‘All Non-food’ articles. Among them Oil-seeds group includes
4 items, Raw cotton 5, Cotton manufactures 3, Other textiles 2,
Hides and skins 3, Metals 5, and Other raw and manufactured
articles 4. The prices on which this index is based are the
wholesale prices prevailing in Bombay. Its construction is
similar to that of the Calcutta index. Like the latter, it is also
indirectly or implicitly weighted by taking, for instance, 2
varieties of silk, 3 of wheat, 5 of raw cotton. Its base is also
July 1914. The index is compiled and issued by the Labour
Office, Government of Bombay, in the Labour Gazette. Along
with the General Index Number, group index numbers, and
‘All food’ and ‘All Non-food’ index numbers are also
published.


Economic Adviser’s Index Number. — The index number
of weekly wholesale prices of certain articles in India, commonly
called Economic Adviser’s index, is of ‘sensitive’ type.
It is based on 23 commodities, which are divided into six
groups. Weekly and monthly average index numbers for the
23 commodities and the six groups are published along with
the general All-Commodities index. The six groups are: (1)
Food and Tobacco, (2) Other Agricultural Commodities, (3)
Raw materials, (4) Manufactured Articles, (5) Primary
Commodities, (6) Chief Articles of Export. The prices on which
these index numbers are based are all-India wholesale prices.
Geometric mean is used in their construction. Their base is
week ending 19th August 1939. They are compiled and issued
by the Economic Adviser to the Government of India.
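
How a geometric mean of price relatives behaves, as compared with the arithmetic mean, may be seen from the following Python sketch. The three relatives are hypothetical, and the sketch uses a simple unweighted geometric mean; the details of the Economic Adviser's actual computation are not reproduced here.

```python
from statistics import geometric_mean, mean

# Hypothetical price relatives on the base period (= 100):
# one price has doubled, one has halved, one is unchanged.
relatives = [200.0, 50.0, 100.0]

arithmetic_index = mean(relatives)
geometric_index = geometric_mean(relatives)

print(round(arithmetic_index, 2), round(geometric_index, 2))
```

The doubling and the halving offset each other exactly under the geometric mean, which stays at 100, whereas the arithmetic mean shows a rise of about 17 per cent.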

Inadequacy of Calcutta, Bombay and Economic Adviser’s 
Indices. — The following points should be kept in view while 
making use of the Calcutta and the Bombay indices of whole- 
sale prices: — 

(1) The price quotations for each commodity in both the 
cases refer only to one day in the month. Therefore, the 
indices cannot be regarded as sufficiently representative of the 
average price level during the month. This is particularly so 
in times of abnormal price movements. 

(2) Each of these indices relates to the price level in one
particular market. But the different markets in the country
differ considerably among themselves with regard to the relative
degrees of importance of the various articles. For
instance, in Calcutta, rice is given a weightage of 4, while
wheat gets that of only 1, but in the north Indian markets the
position will be reversed. Therefore, these indices are not
conclusive in discussions relating to All-India problems.

Partly because of the above limitations of the Calcutta and
the Bombay wholesale price indices, and, may be, partly because
of these index numbers being much higher than the
Economic Adviser’s index for the same month, the tendency to
use the Economic Adviser’s All-Commodities index in discussions
of economic problems relating to India is increasing. But,
it should be remembered that this index number is not a
‘general-purpose’ index. As its name implies, it is based on
‘wholesale prices of certain articles in India.’ ‘Certain
articles’ which it includes are not the only representative
articles for this vast country, whose inland and foreign trades
are of considerable dimensions.

Therefore, the necessity of the compilation of an All-India
general-purpose index, based on a reasonably adequate number
of representative commodities, cannot be over-emphasized.
The commodities may be divided into Food and Non-Food
groups, and the system now followed by the British Board
of Trade for the construction of its index is worth
adopting as the model. The use of proper weights, geometric
mean and chain base method would place the index
on modern and scientific lines. It is necessary to compile such
an index number in view of the fact that the main uses of
wholesale price indices are in relation to national economic
problems and for the study of general tendencies. They are
considered in relation to movements of currency, exchange,
foreign wholesale prices, indices of production, wages, retail
prices etc. These purposes cannot be served by the Calcutta
and the Bombay indices which can be utilized, for reasons
already indicated, only in local (and not national) economic
problems.

Discontinued Wholesale and Retail Price Indices. 

It has been decided to discontinue the compilation of the
following index numbers included in the Index Numbers of
Indian Prices (quinquennial, with annual supplements):

1. Index Numbers of Prices for Exported and Imported
Articles.

2. Index Numbers of Retail Prices of Food Grains.

3. Weighted Index Number of Wholesale Prices.

Indices of Prices for Exported and Imported Articles. —
These indices include separate indices for (i) 28 exported
articles, (ii) 11 imported articles and (iii) all articles. The
all-articles index number is generally known as the All-India
Wholesale Price Index Number. In using these indices it
must be kept in view that they are the unweighted arithmetic
averages of the price relatives of the various commodities
worked out with 1873, a rather old year, as base.
Another defect of these indices arises from the introduction,
in the years following the base year, of a few commodities
whose quotations were not included in the index in the base
year, as also from the replacement of older varieties of some
commodities by new ones at intervals of varying lengths.
Largely because of these factors and also because of the fact
that the list of articles had not been revised since 1889, these
indices, particularly the All-India Wholesale Price Index
Number, had outlived their utility. The Bowley-Robertson
Committee was, therefore, not in favour of the continuance of
the series. They have been discontinued since August 1941.

Index Numbers of Retail Prices of Food Grains. — These
indices are the unweighted averages of the price relatives of seven
commodities, viz., rice, wheat, jowar, bajra, gram, barley and
ragi worked out with 1873 as base. The prices used in the
computation are those reported by Provincial Authorities.
These quotations are based on information collected by officials
in the tehsil or taluk centres from enquiries in bazaar areas.
But, as pointed out in an earlier chapter,* the collection of these
prices is not done with the care it deserves, with the result
that official figures have been much different from those supplied
by the traders. Index numbers compiled from unreliable
prices cannot be regarded as reliable. This fact must be borne
in mind while making use of the indices of retail prices of food
grains.


* See page 70. 





Weighted Index Number of Wholesale Prices. — This index
number includes 37 commodities of which 14 are articles of
food, 17 of raw produce and 6 of manufactures. The method
of weighting adopted is to take a number of quotations equal
to the weight in the case of each commodity. The base year
is 1871, but the figures have been re-calculated by shifting the
base to 1873 for purposes of comparison with the other series
of index numbers. In using this index number, it must be
remembered that the re-calculated figures are subject to a
certain margin of error since the arithmetic mean does not,
as pointed out in the last chapter, satisfy the time reversal
test. It is not safe to rely on this index number as a guide to
price movements in general for the additional reason that
certain important articles like groundnuts, pig-iron and steel
manufactures are not included in it.
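
The margin of error introduced by shifting the base of an arithmetic-mean index may be illustrated by the following Python sketch. The two commodities, their prices and the year 1880 are invented for the purpose; only the base years 1871 and 1873 are taken from the text. A geometric-mean index is computed alongside to show that it does not suffer from the same defect.

```python
from statistics import geometric_mean, mean

# Hypothetical prices of two commodities in three years.
prices = {
    "A": {1871: 10.0, 1873: 20.0, 1880: 20.0},
    "B": {1871: 10.0, 1873: 5.0, 1880: 10.0},
}


def index(year, base, average):
    """Unweighted index for `year` on `base`, using the given average."""
    return average([100.0 * p[year] / p[base] for p in prices.values()])


def shifted(year, old_base, new_base, average):
    """Index for `year` re-calculated to `new_base` by simple proportion."""
    return 100.0 * index(year, old_base, average) / index(new_base, old_base, average)


# Arithmetic mean: the shifted figure differs from the direct one.
am_shifted = shifted(1880, 1871, 1873, mean)
am_direct = index(1880, 1873, mean)

# Geometric mean: the two figures agree.
gm_shifted = shifted(1880, 1871, 1873, geometric_mean)
gm_direct = index(1880, 1873, geometric_mean)

print(am_shifted, am_direct, round(gm_shifted, 2), round(gm_direct, 2))
```

With these figures the arithmetic index for 1880 is 120 when obtained by shifting the base from 1871 to 1873, but 150 when computed directly on 1873; the geometric index is about 141.4 either way.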

Cost of Living Index Numbers. 

Twenty-seven working class cost of living index numbers*
are being regularly published in the Monthly Survey of Business
Conditions in India, in addition to provincial bulletins or
gazettes.

* Besides these, cost of living indices are also being compiled for
Jalgaon in Bombay, for a few more towns in the C. P., for Government
servants drawing upto Rs. 30 per month in Meerut and Gorakhpur and
Secretariat peons in Lucknow in the U. P., and for Bangalore in Mysore
State, but they are not published in the Monthly Survey.

Diversity in Scope and Construction. — These index
numbers are compiled on different bases. The cost of
living index number for Bombay is compiled on year
ending June 1934 as the base, that for Ahmedabad on year
ending July 1927, and that for Sholapur on year ending January
1928. They are compiled by the Bombay Labour Office. Indices
for Nagpur and Jubbulpore are compiled by the Department of
Industries of C. P. and Berar with August 1939 as the base,
and are published in a special bulletin every month. Indices
for Patna, Muzaffarpur, Monghyr, Jamshedpur, Jheria and
Ranchi are compiled by the Department of Industries of Bihar
and for Cuttack by that of Orissa with average cost of living
for five years preceding 1914 as the base. The base for the
index number for working class cost of living for Madras is
year ending June 1936, for indices for Lahore, Sialkot,
Ludhiana, Rohtak and Multan in the Punjab is 1931-35, for
index number for Cawnpore in the United Provinces is
August 1939 and for indices for Vizagapatam, Ellore, Bellary,
Cuddalore, Coimbatore, Madura, Trichinopoly and Calicut is
year ending June 1936. Thus, the base period of the various
index numbers varies from the quinquennium ending 1914 as
in the case of centres in Bihar and Orissa to as recent a base
as August 1939.

For obtaining ‘weights’ for these indices family budget
enquiries have been made from time to time in some of the
provinces. Detailed and comprehensive studies have been
made only in a few places such as Bombay, Ahmedabad,
Sholapur, Madras City, Nagpur and Jubbulpore. The Cawnpore
index is based on the tabulations of 300 out of 1500
family budgets of mill-workers that were collected in 1938-39
by the Labour Office of the U. P. The weights used in the
compilation of the indices for the Punjab centres were derived
from only 138 family budgets of workers getting Rs. 50 or
less per month which were collected in connection with the
investigations of the Royal Commission on Indian Labour.
The weights used in the construction of Bihar and Orissa
indices do not rest on any adequate statistical basis.

Further, neither is there any uniformity in the various 
provinces regarding the agency employed for the collection 
of prices nor regarding the frequency with which the data are 
collected. In some centres prices are collected weekly, in
others fortnightly, while in the Punjab centres they are 
recorded only on the last day of each month. 

The scope of the indices also shows great variations.
Almost all the indices are fairly comprehensive in regard to the
Food Group, but they show much variation among themselves
with regard to other groups. The Jheria index does not include
the Fuel and Lighting group, the indices for centres in Bihar,
Orissa and the Central Provinces do not include House-rent,
while the Clothing Group is somewhat unsatisfactory in most
of the indices, firstly because some indices include very few
items of clothing and secondly because the obtaining of
comparable price quotations is difficult. The Bombay and the
Madras indices are fairly comprehensive in respect of the
Miscellaneous Group, but the Bihar and Orissa ones completely
ignore these items.

Thus, there is a great deal of diversity in the scope and
method of construction of the above-noted cost of living index
numbers as between province and province. The base
periods differ widely in time as well as in length; the
‘weights’ have been obtained as a result of enquiries made
in the neighbourhood of the basic period in each case so that
the several series refer to widely differing standards of living;
the agency of collection of prices and the frequency of
quotations show lack of uniformity; and, some of the series ignore
important items like house-rent and miscellaneous articles.
For all these reasons, the cost of living indices relating to the
different Provinces are not directly comparable with one
another.

The index numbers for the centres in the Provinces of
Bombay, Madras and the Punjab are similar in construction.
The items in the list of articles consumed by the working
classes are grouped under five heads, viz., food, fuel and
lighting, clothing, house-rent and miscellaneous. Separate indices
are worked out for the individual groups. The index for any
group is the weighted average of the price relatives of the
various items in the group, the weight assigned to any item
being the ratio which the expenditure on this item bears to
the total expenditure on all items included in the group. The
group indices are combined into a general index in a like
manner. A detailed study of the construction of the cost of
living index number for industrial workers in Bombay would
clearly show the whole process involved.
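
The two-stage weighting described above may be sketched in Python as follows. The budget is entirely hypothetical (three groups only, with invented base-period expenditures and price relatives), and the function names are our own; the sketch is meant only to show the arithmetic, not the actual Bombay computation.

```python
# Hypothetical family budget: for each item, (base-period expenditure in
# rupees, price relative of the current period on the base period).
budget = {
    "food": {"rice": (20.0, 120.0), "wheat": (10.0, 150.0)},
    "fuel and lighting": {"firewood": (4.0, 110.0), "kerosene": (1.0, 160.0)},
    "house-rent": {"rent": (5.0, 100.0)},
}


def group_expenditure(items):
    """Total base-period expenditure on the items of one group."""
    return sum(expenditure for expenditure, _ in items.values())


def group_index(items):
    """Average of price relatives weighted by each item's share of group expenditure."""
    total = group_expenditure(items)
    return sum(expenditure / total * relative for expenditure, relative in items.values())


def general_index(budget):
    """Group indices combined in a like manner, weighted by group expenditure."""
    total = sum(group_expenditure(items) for items in budget.values())
    return sum(
        group_expenditure(items) / total * group_index(items)
        for items in budget.values()
    )


food = group_index(budget["food"])
fuel = group_index(budget["fuel and lighting"])
overall = general_index(budget)
print(round(food, 2), round(fuel, 2), round(overall, 2))
```

With these figures the food index is 130, the fuel and lighting index 120, and the general index 125.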

Bombay Working Class Cost of Living Index. — This index
was first published in 1921, and was based on the aggregate
consumption method in the absence of any reliable weights to
be given to different items. The Bombay Labour Office conducted
the first inquiry into working class family budgets in
Bombay City between May 1921 and April 1922 and a second
inquiry between May 1932 and June 1933 to ascertain weights
proportional to the relative expenditure on the different items
consumed by an average Bombay worker’s family. The results
of the second inquiry have been used in the compilation of
the revised index, commodities have been made as comprehensive
as possible and the “miscellaneous” group has been
added. This index and the indices for Ahmedabad and
Sholapur, are published in the Labour Gazette issued by the
Labour Office of the Government of Bombay.

The items included in the revised index have been divided
into five main groups, viz., food, fuel and lighting, clothing,
rent and miscellaneous. The food group includes 28 articles,
which are: rice, patni, wheat, jowari, bajri, turdal, gram, raw
sugar (gul), refined sugar, tea, four varieties of fish, mutton,
milk, ghee, salt, dry chillies, tamarind, turmeric, potatoes,
onions, brinjals, pumpkins, cocoanut oil, sweet oil and
ready-made tea. The expenditure on other articles which can be
included in the food group has been proportionally divided
among items of like nature included in the food group; for
example, the expenditure on refreshments has been added to
expenditure on ready-made tea, and that on sweet-meats has
been divided equally between sugar and milk. The fuel and
lighting group includes charcoal, firewood, kerosene oil and
matches. The clothing group includes dhotis, coating, shirting,
cloth for trousers, sarees, and khans. The figure adopted for
house-rent is the average rent per tenement obtained as a
result of the 1932-33 family budget inquiry. The miscellaneous
group includes barber (shave), washing soap, medicine,
supari, bidis, travelling to and from native place and
newspapers. Thus, this index number includes 46 articles.

The price quotations for almost all the articles, except
clothing articles, four varieties of fish, brinjals and pumpkins,
are collected weekly by the officers of the Labour Office from
two shops in twelve different industrial areas. The prices of
all the clothing articles except khans are obtained from four
different cotton mills having retail shops in Bombay City.
Prices of fish, brinjals and pumpkins are taken from the
Municipal records.

The method* adopted for computation of the index number
is very similar to that of the British Ministry of Labour.
Price quotations for the current year are first expressed as
percentages of the prices for the base year. These percentages
are weighted by the percentages which expenditure on each
item bears to the total expenditure on the group to which it
belongs, and the products are summed. The sum of the products
divided by 100 gives the weighted average index for each
group. The group index numbers are again weighted by the
percentage distribution of the expenditure on each of the
groups, and then divided by the sum of the weights. The
resulting weighted average is the final index. The percentages
by which group index numbers are weighted are those arrived
at as a result of the 1932-33 inquiry, except in the case of the
‘miscellaneous’ group, whose weight is 14, and not 25 as it
might have been in view of the fact that the sum of the
weights (percentages) for the first four groups comes to 75.
The figure 14 represents the percentage which expenditure on
the actual items included in the miscellaneous group bears to
the total expenditure of the average working class family.
The percentages for the different groups are:

        Food                    . .     47
        Fuel and lighting       . .      7
        Clothing                . .      8
        House-rent              . .     13
        Miscellaneous           . .     14

        Total                   . .     89

* With effect from the index for the month ending 10th May, 1943,
the method of compilation of the index number for the cereals sub-group
has been readjusted because of the unavailability of cereals like patni and
jowari in Bombay City and the appearance and disappearance of individual
varieties of cereals in present conditions.
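
The second stage of the computation, in which the group indices are weighted and the result divided by the sum of the weights, may be sketched as follows. The weights are those quoted above (summing to 89); the group index values and the function name `general_index` are hypothetical and serve only to show the arithmetic.

```python
# Group weights as quoted in the text; they sum to 89, not 100,
# which is why the weighted sum is divided by the sum of the weights.
WEIGHTS = {
    "food": 47,
    "fuel and lighting": 7,
    "clothing": 8,
    "house-rent": 13,
    "miscellaneous": 14,
}


def general_index(group_indices, weights=WEIGHTS):
    """Weighted average of group indices, divided by the sum of weights."""
    total_weight = sum(weights.values())
    weighted_sum = sum(weights[g] * group_indices[g] for g in weights)
    return weighted_sum / total_weight


# Hypothetical group indices for some month (base = 100).
groups = {
    "food": 120.0,
    "fuel and lighting": 110.0,
    "clothing": 130.0,
    "house-rent": 100.0,
    "miscellaneous": 105.0,
}

final_index = general_index(groups)
print(round(final_index, 1))
```

Dividing by 89 rather than 100 ensures that the final index stands at 100 when every group index stands at 100.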


Government of India's Latest Schemes. 

Because of the unsatisfactory character of the retail price
index numbers included in the Index Numbers of Indian Prices
and of the existing cost of living indices compiled in various
centres of India, there is an evident case for constructing retail
price index series in a scientific manner and for compiling a
cost of living index series on a uniform basis. The Rau Court
of Enquiry, which was appointed in August, 1940 under the
Trade Disputes Act, 1929 to investigate the dispute
regarding Dearness Allowance on the G. I. P. Railway, made
the following observations in para 111 of its report:

“None of the cost of living index figures at present
available are entirely satisfactory . . . . The first requisite
for any satisfactory revision of the allowances that we have
recommended is the preparation of up-to-date cost of living
index figures for three distinct classes of areas, city, urban
and rural . . . . We would accordingly recommend that the
question of preparing and maintaining such figures for the
purposes of the Central Government be considered by the
Government of India.”

Acting on this suggestion the Government of India
formulated a centrally controlled scheme for the preparation and
maintenance of cost of living index numbers in selected
centres, a brief outline of which was circulated to Provincial
Governments in October 1941, to which most of them gave a
very encouraging response. The Third Conference of Labour
Ministers held at Delhi in January 1942 concluded that it was
advisable to ensure uniformity of technique in the compilation
of cost of living index numbers in the various provinces.
Recently, the Government have appointed a Director, Cost of
Living Index Scheme, to make the necessary preparations for
the compilation of cost of living indices in selected centres of
British India on a uniform basis.

The Government, feeling that during the war period
occasions might arise when reliable figures indicating the
changes in the level of retail prices would be urgently
required, decided to proceed concurrently with a scheme for the
compilation of retail price indices for those centres for which
cost of living indices would also be ultimately compiled.
Owing to difficulties of organisation, it has been decided,
tentatively, to select 15 rural centres, being way-side railway
stations, situated in different parts of the country, including
Indian State territory, and attempt the compilation of their
retail price indices.

Thus, the Government of India are proceeding with three 
distinct schemes : — 

1. The Main Cost of Living Index Number Scheme,

2. Retail Price Index Number Scheme, Urban Centres, and

3. Retail Price Index Number Scheme, Rural Centres.

The Main Cost of Living Index Number Scheme.— Fifty
centres, 48 from different provinces of India (excluding those
of the North-West Frontier and Madras) together with Ajmer and
Delhi, have been selected for which it is proposed to compile cost
of living indices. Family budget enquiries are to be conducted
in these centres by the Provincial Governments or
Administrations concerned. It is hoped that some 20,000 family
budgets would be collected with a view to determining the
necessary ‘weights.’ The lists of items for the Retail Price Index
Number Scheme have been so drawn up that if and when
family budget enquiries in the selected centres are completed
and ‘weights’ ascertained, it may be possible to proceed
immediately with the compilation of the necessary cost of living
index numbers by making use of the retail price data collected
by then.

Retail Price Index Number Scheme, Urban Centres.— The
centres selected for this scheme are the same as those selected
for the main cost of living index scheme. The necessary work
with regard to this scheme has begun, and weekly quotations
of retail prices are being obtained from some 30 centres in the
country, and utilized, after careful scrutiny as to their being
comparable, in the preparation of index numbers.

Retail Price Index Number Scheme, Rural Centres.—
The 15 centres selected for this scheme are divided into three
zones: Northern, Eastern and Southern. Investigations
regarding the food and clothing habits of the poorer sections of
the community at these centres have been completed. On their
basis, the lists of articles for which prices are collected have
been drawn up, and certain shops have been fixed in each
centre for the collection of prices regularly every week on a
day appointed for this purpose. The task of the collection of
prices has been entrusted to the station masters of these
railway stations and their work is regularly supervised by the
Inspectors of Railway Labour within whose beat the stations
lie. The returns, after careful scrutiny and tabulation in the
office of the Director, Cost of Living Index Scheme, are utilized
for the compilation of monthly index numbers for all these
centres.

Industrial Activity Index. 

The inherent difficulties in the construction of an index
number of industrial activity are well known. The absence
of an Economic Census in India adds seriously to these
difficulties. Agricultural production in this country fluctuates
considerably with seasons and climatic conditions, and is, for
this reason, not suited to a short-term enquiry covering a
month or a quarter of a year. Besides, statistics of
agricultural production are incomplete and not fully reliable.
Therefore, in spite of its outweighing influence, agricultural
production cannot be included as a constituent of the
Production Index. The Bowley-Robertson Committee in their report
recommended that an industrial production index should not be
combined with that of agricultural production. If it were
combined, the weight to be given to agriculture would be so
considerable that it would dominate the general index of Indian
business activity.

Even the statistics of industrial production are not quite
sufficient in India. In spite of this, Capital, the well-known
weekly journal of Calcutta, has been publishing every month
an Index of Indian Industrial Activity since March 1938.

“Capital” Index of Indian Industrial Activity.— This
index is published monthly and 1935 is taken as the base year.
The series selected and the weights assigned to each item for
computing this index are:

        Series Selected                                 Weight

        Industrial Production:
          1. Cotton Manufactures                . .        9
          2. Jute Manufactures                  . .        6
          3. Steel Ingots                       . .        5
          4. Pig Iron                           . .        8
          5. Cement                             . .        5
          6. Paper                              . .        3
        Mineral Production: Coal                . .        7
        Rail & River-borne Trade                . .       24
        Financial Statistics: Cheque Clearances . .       20
        Trade, Foreign & Coastal:
          Exports                               . .        4
          Imports                               . .        3
        Shipping, Foreign & Coastal:
          Tonnage entered                       . .        3
          Tonnage cleared                       . .        3


Since March 1941 Trade, Foreign and Coastal, and
Shipping, Foreign and Coastal, have been left out. Instead,
Notes in circulation (base: April, 1935 to March, 1936) with
weight 6 and Consumption of Electricity with weight 7 have
been included. The weighted geometric mean forms the
general index and seasonal fluctuations are eliminated by
means of a twelve months' moving average. The index for
cement appeared up to 1937-38 and has since then been
discontinued with the remark ‘figures not available.’ A
specimen of the construction of this index is given in table 17,
chapter X.
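
The two operations named above, a weighted geometric mean of the component series and a twelve months' moving average, can be sketched as below. This is a minimal illustration, not the journal's actual procedure: the function names and the two component relatives are hypothetical, and a simple uncentred moving average is assumed.

```python
import math


def weighted_geometric_mean(relatives, weights):
    """Weighted geometric mean of index relatives (each with base = 100)."""
    total = sum(weights[k] for k in relatives)
    log_sum = sum(weights[k] * math.log(relatives[k]) for k in relatives)
    return math.exp(log_sum / total)


def moving_average(values, window=12):
    """Simple moving average over `window` consecutive months (assumed uncentred)."""
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Hypothetical relatives for two of the series, with the weights from the table.
relatives = {"cotton manufactures": 100.0, "jute manufactures": 121.0}
weights = {"cotton manufactures": 9, "jute manufactures": 6}
combined = weighted_geometric_mean(relatives, weights)

# Thirteen hypothetical monthly index values give two smoothed values.
smoothed = moving_average([float(m) for m in range(1, 14)])
```

Working through logarithms keeps the computation stable and makes plain that the geometric mean averages proportional, rather than absolute, changes.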

Statistics for the above series are taken from the monthly
publications of the Department of Commercial Intelligence
and Statistics and from the Statistical Summary of the Reserve
Bank of India. This index does not afford an idea of the
activities of people living in rural areas. And, even so far as
urban people are concerned, it is not fully representative. It
does not include the production of sugar, tea, hides and skins,
which are quite important in the Indian industrial structure
to-day. However, in the absence of complete and adequate
statistical data no better index could be compiled.


BRITISH INDEX NUMBERS 
Wholesale Price Index Numbers. 

Three important wholesale price index numbers that are
compiled and maintained in Great Britain are:

(1) Board of Trade Index Number. 

(2) Economist Index Number. 

(3) Statist Index Number. 

Board of Trade Index Number.— The present series relating
to this index begins with January 1935 and replaces an older
series dating from 1920, which had replaced a still older series
designed before the last Great War. The total number of
commodities included is 200, and the total number of
quotations is 258, the difference being due to the fact that in certain
cases the average of more than one quotation is used to get
a more representative figure. The commodities include food
articles, materials of industry and semi-manufactured goods
and are arranged in 11 groups. Quotations are based upon
market values. The index is not weighted in the ordinary
sense of the word but is indirectly weighted by using two or
more quotations for articles of special importance. Price
relatives are calculated upon the chain base method. The
geometric mean of the 11 group indices is then taken, each
group counting equally. The base year has been successively
1913, 1924 and 1930. The index is published in the Labour Gazette.
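
The chain base method and the unweighted geometric mean mentioned above may be illustrated generically as follows. The sketch assumes the usual textbook meaning of chain base (each period's price is first related to the preceding period and the links are then multiplied together); it does not reproduce the Board of Trade's actual working, and the function names and figures are hypothetical.

```python
import math


def chain_index(prices):
    """Chain link relatives so that the first period stands at 100."""
    index = [100.0]
    for previous, current in zip(prices, prices[1:]):
        link = current / previous          # link relative on the preceding period
        index.append(index[-1] * link)     # chained back to the first period
    return index


def geometric_mean(values):
    """Unweighted geometric mean, every value counting equally."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical prices of one commodity in three successive periods.
chained = chain_index([10.0, 12.0, 9.0])

# Hypothetical indices for two groups combined on a footing of equality.
combined = geometric_mean([100.0, 121.0])
```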

Economist Index Number.— This index number was
originally framed in 1864 and has been revised twice: in 1911
and in 1928. In its present form it comprises 58 commodities
with 1927 as the base year. Formerly the arithmetic average
was used in its construction, but now the unweighted geometric
mean is used. Results are published monthly and fortnightly.
It is compiled by the Economist, an important periodical of
Great Britain.

Statist Index Number.— This index number is really a
continuation of a series begun by the late Mr. Augustus
Sauerbeck, who used 44 commodities and selected as his base
the average of the monthly wholesale market prices of these
commodities in the period 1867-77. He weighted the index
not directly, but indirectly by taking two or more quotations
for articles of special importance. This method has also been
followed by the Statist, a periodical of Great Britain, which
continued this index from 1912. The same base is being
maintained even now. In its present form it is based on the
wholesale prices of 19 foodstuffs and 26 raw materials. These
45 commodities are arranged in 6 groups. This index is
valuable where a continuous record of figures over a long
period is required, since it is presented in almost the same
form in which it originated and since its compilers publish
every year full details of its construction.

Cost of Living Index Number. 

The most important index number is that compiled by the 
British Ministry of Labour. 

Ministry of Labour's Cost of Living Index Number.— This
index number is designed to measure the average increase in
the cost of maintaining unchanged the pre-War standard of
living of the working classes. The foodstuffs included
represent about 75 per cent. of working class expenditure on food.
Retail prices are obtained from over 5,000 retailers, distributed
among over 500 towns and villages. The weights used are
based on the average expenditure of 1,944 urban
working-class families. This information was collected by the Board
of Trade in 1904. Prices in July, 1914 are used as the base
of the index. The use of weights relating to 1904 instead
of 1914 is considered reasonable on the ground that no great
change took place in the standard of living between 1904
and 1914.

The weighted average increase in the relative prices of
foodstuffs is combined with similar figures showing changes in
rents, clothing, fuel and light, and other items. The weights
used are: food 7½, rent 2, clothing 1½, fuel and light 1,
miscellaneous ½; total 12½. The final index, along with the five
group indices, is published monthly by the Ministry of Labour.
No allowance has yet been made for any changes in the
standard of living or for any economies or re-adjustments in
consumption and expenditure since 1914.
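
As a worked illustration of this weighting, take the weights as 7½, 2, 1½, 1 and ½ (total 12½) and write F, R, C, L and M for the percentage increases over July 1914 in food, rent, clothing, fuel and light, and miscellaneous items respectively. The symbols and the figures substituted below are hypothetical and are chosen only to show the arithmetic.

```latex
\[
I \;=\; \frac{7.5\,F + 2\,R + 1.5\,C + 1\,L + 0.5\,M}{12.5}
\]
% Hypothetical increases: F = 60, R = 50, C = 100, L = 80, M = 80 (per cent.)
\[
I \;=\; \frac{7.5(60) + 2(50) + 1.5(100) + 1(80) + 0.5(80)}{12.5}
  \;=\; \frac{450 + 100 + 150 + 80 + 40}{12.5}
  \;=\; \frac{820}{12.5} \;=\; 65.6
\]
```

On these assumed figures the cost of living would stand 65.6 per cent. above the level of July 1914. Food alone carries 7½ parts out of 12½, or three-fifths of the whole weight.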

Indices of Production. 

The two important indices are:

1. London and Cambridge Economic Service Index of
Physical Volume of Production.

2. Board of Trade Index of Industrial Production.

London and Cambridge Index.— This index includes
agriculture and manufacturing and extractive industries. Changes
in the physical volume of production indicate the extent to
which the country's resources are being used in industry, and
also indicate the results in terms of consumable goods. The
index is calculated in two forms: (1) an annual index, and
(2) a quarterly index. Information in the annual index is
tabulated under the following heads:

        Group   I.     Agriculture.
          „     II.    Principal Minerals.
          „     III.   Iron and Steel, Engineering & Shipbuilding Trades.
          „     IV.    Non-Ferrous Metal Trades.
          „     V.     Textile Trades.
          „     VI.    Food, Drink, and Tobacco Trades.
          „     VII.   Chemical and Allied Trades.
          „     VIII.  Paper, Printing and Allied Trades.
          „     IX.    Leather Trades.
          „     X.     India-rubber Trade.
          „     XI.    Building and Contracting Trades.

The quarterly index is compiled upon the same general
principles as the annual index subject to the omission of
certain information which is not available quarterly. Since
1929, weights assigned have been proportional to the net
output of industries obtained as the result of the Census of
Production of 1924. The system of weighting adopted is,
therefore, base-year weighting.

Board of Trade Index.— This index differs from the annual
(but not the quarterly) index of the London and Cambridge
Economic Service by the exclusion of agriculture.
But certain branches of industries not covered by the latter
are included in this index. The industries are classified into
groups comparable, so far as possible, with the grouping
adopted for the Census of Production of 1924, viz.:

(1) Mines and Quarries, (2) Iron and Steel and
Manufactures thereof, (3) Non-ferrous Metals, (4) Engineering
and Shipbuilding, (5) Textiles, (6) Chemical and Allied
Trades, (7) Paper and Printing, (8) Leather, Boots, and
Shoes, (9) Food, Drink and Tobacco, (10) Gas and Electricity.

The objective of this index is the net output of the various
industries, i.e., the excess of the value of the products over
the value of the materials utilized in their manufacture.
Agriculture is excluded from the inquiry because of the
fluctuations in agricultural production with seasonal climatic
conditions and the consequent unsuitability of such production
for an inquiry covering less than a year. The method actually
adopted is to compare the best available statistics measuring
the volume of production in the current quarter with the
corresponding figures for 1924. Weights are assigned in
proportion to the net output for 1924.

Indices of Business Activity. 

Literature on the subject of business barometers and
business activity indices is voluminous. The London and
Cambridge Economic Service issues a monthly bulletin of
comparable statistics upon every imaginable branch of economics
and finance, such as prices and wages, output and internal
activity, foreign exchanges, finance. The Board of Trade, the
Bank of England, the Economic Advisory Council, among others,
issue periodical tables and charts relating to general
economic conditions. The most important index, designed
particularly for a study of business conditions, is that of the
Economist.

“Economist” Index of Business Activity.— This is a
monthly index and goes back to 1924. It was revised in July,
1936, and recalculated with 1935 as the base year. Its object
is to measure changes in the economic activity of the United
Kingdom in quantitative, not monetary, units. That is, it is
designed to afford an approximate idea of fluctuations in the
“real” national income. The component series of the index
and their respective weights are:

(1) Employment 10, (2) Consumption of coal 4, (3)
Industrial Consumption of Electricity 2, (4) Merchandise on
Railways 4, (5) Commercial Motors in use 2, (6) Postal
Receipts 3, (7) Building Activity 2, (8) Iron and Steel
available for Consumption 2, (9) Consumption of Cotton 1, (10)
Import of Raw materials 2, (11) Export of British
manufactures 3, (12) Shipping movements 2, (13) Metropolitan,
Country and Provincial Bank clearings 4, (14) Town
clearings 1.

All the series excepting building activity are corrected
for seasonal fluctuations. The weighted geometric mean of the
constituent series gives the Business Activity Index.
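
The combination of the fourteen series can be written compactly. In the sketch below, r_i denotes the seasonally corrected relative of the i-th series (1935 = 100) and w_i its weight as listed above; these symbols are introduced here for illustration only. The listed weights add up to 42.

```latex
\[
\log I \;=\; \frac{\sum_{i=1}^{14} w_i \log r_i}{\sum_{i=1}^{14} w_i},
\qquad
\sum_{i=1}^{14} w_i \;=\; 10+4+2+4+2+3+2+2+1+2+3+2+4+1 \;=\; 42 .
\]
```

Employment, with a weight of 10 out of 42, thus accounts for nearly a quarter of the whole index.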


UNITED STATES’ INDEX NUMBERS. 

Wholesale Price Index Numbers. 

Following are the well-known wholesale price index
numbers in the U.S.A.:

1. Bureau of Labour Statistics'.

2. Federal Reserve Board's.

3. Dun's.

4. Annalist's.

5. Fisher's.

Bureau of Labour Statistics' Index.— This index number
was a weighted average of relatives up to 1913, based upon the
average price of 1890-1899. Since 1914 this index has been the
weighted aggregate of actual prices, and the weights now
assigned are the amounts of goods marketed in 1919. Prices
of 450 commodities are regularly collected by the Bureau.
These commodities are arranged in 9 groups. Monthly and
annual indices for the commodity groups, separately and
combined, and reduced to relatives on base year, 1913, are
published in The Monthly Labour Review, and in Wholesale Prices,
both issued by the Bureau of Labour Statistics, Washington.

Federal Reserve Board's Index.— This index of wholesale
prices has been prepared since October 1918, the series being
calculated back to 1913. The price quotations, commodities
and the method of calculating the index are the same as those
of the Bureau of Labour Statistics' index, except that
commodities are grouped in three major classes: raw materials,
producers' goods and consumers' goods. Monthly and annual
indices appear in The Federal Reserve Bulletin, Washington.

Dun's Index.— This index number is based upon the
wholesale prices of about 200 commodities obtained from the
principal markets of the U.S.A. The commodities are grouped
into 7 classes, and weights are given according to average
annual per capita consumption. The weighted aggregate of
actual prices yields the required index, which is published in
Dun's Review, New York. It is an index issued by a private
organisation.

Annalist's Index.— This index is computed by the
Annalist, a New York financial journal. It is based upon
25 food products. The quotations are taken from Chicago and
New York markets and are selected, it is held, so as to be
representative of a theoretical family budget. This index is a
simple arithmetic average of relatives, with the average price for
1890-99 as the base. No explicit weighting is used. Weekly,
monthly and yearly indices are published in the journal.
This index is also issued by a non-government organisation.

Fisher's Index.— Professor Irving Fisher of Yale
University publishes weekly through a syndicate of American
newspapers an index number of wholesale prices and its
reciprocal, the purchasing power of the dollar. The series began in the
first week of January 1923. The quotations are taken from
Dun's Review. It is a weighted aggregate of prices of 205
commodities, the actual quantities of each commodity sold in
1919 being the weights for their respective commodities. The
year 1913 is used as the base. This index is also compiled by
a private organisation.
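
A weighted aggregate of actual prices, with fixed quantities as weights, and its reciprocal (the purchasing power of money) may be sketched as below. The commodities, prices, quantities and function names are hypothetical; only the form of the calculation follows the description above.

```python
def weighted_aggregate_index(base_prices, current_prices, quantities):
    """Cost of a fixed bundle at current prices as a percentage of its
    cost at base-period prices. The quantities act as the weights."""
    base_cost = sum(base_prices[c] * quantities[c] for c in quantities)
    current_cost = sum(current_prices[c] * quantities[c] for c in quantities)
    return 100.0 * current_cost / base_cost


def purchasing_power(index):
    """Reciprocal of the index, taking the base-period value of money as 1."""
    return 100.0 / index


# Invented figures: quantities play the part of the fixed weights.
quantities = {"wheat": 10.0, "cotton": 5.0}
base_prices = {"wheat": 2.0, "cotton": 4.0}
current_prices = {"wheat": 3.0, "cotton": 5.0}

price_index = weighted_aggregate_index(base_prices, current_prices, quantities)
money_value = purchasing_power(price_index)
```

Because the quantities are held fixed, any movement in the index reflects changes in prices alone.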

Cost of Living Index Number. 

An important cost of living index number issued by the
United States Government is that compiled by the United
States Bureau of Labour Statistics.

Bureau of Labour Statistics' Index of Cost of Living.—
This index number has been published by the Bureau since
1918, although the data have been computed back to December
1914. The price quotations refer to commodities consumed
by working class families. They are, in some cases, submitted
by storekeepers and are collected, in other cases, by the Bureau's
field agents. The groupings are: (1) food, (2) clothing, (3)
rent, (4) fuel and light, (5) furniture and furnishings, and
(6) miscellaneous items. The system of double weighting, as in
the Bombay Cost of Living Index Number in India, is adopted.
Weights are based upon the result of a study of more than
12,000 family budgets in 92 localities in the U.S.A. The year
1913 is used as the base. Changes in the cost of living for the
country as a whole and for different cities are regularly
published in the Monthly Labour Review, U.S.A. Bureau of
Labour Statistics.

Indices of Production. 

Among these indices, those compiled by Stewart, King and
Snyder are important. The Harvard Committee on Economic
Research also prepares them.

Harvard Committee's Index of Physical Production.— This
index is a quantity index prepared, separately and combined,
for agriculture, manufacture and mining. Annual amounts of
production of different items in these groups are expressed
as relatives of the production in 1909, the base year.
The weighted geometric mean of the group index numbers gives
the combined index. The indices for the groups and the
combined index are issued as adjusted and unadjusted.

Indices of General Business Conditions. 

While dealing with indices of business conditions in
chapter XII it was pointed out that business in general, and
certain of its phenomena in particular, pass through
well-defined major and minor changes, so that it is possible not only
to measure their present conditions but also to forecast what
the future trend is likely to be. This service is performed by
the Harvard Committee on Economic Research through its
“Index of General Business Conditions.”

Harvard Index of General Business Conditions.— As a
result of an elaborate study of the data for the period 1903
to 1914 it was found that there was a sequence in the
movements in the speculative, business, and money markets which
could be statistically measured, and graphically presented.
Accordingly, three curves, A, B, and C, for Speculation,
Business and Money respectively are presented on a chart
which generally shows that movements in Curve A precede
those in Curve B, and those in Curve B precede those in Curve
C. This movement occurs with such a regularity of sequence
that the three curves afford a logical basis for scientific
business forecasting.

The following series were used in the chart covering the
trial period, 1903 to 1914:

Curve A — Speculation:
        New York Bank Clearings
        Prices of Securities

Curve B — Business:
        Wholesale Commodity Prices
        Bank Clearings outside New York City
        Pig-iron Production

Curve C — Money:
        Interest Rate on Commercial Paper
        Loans and Deposits of New York City Banks.

The index was presented in the form of a chart. The
following series were used for the period 1919-1924:

Curve A — Speculation:
        Bank debits
        Industrial Stock Prices

Curve B — Business:
        Bank debits for 140 cities outside New York City
        Cyclical Index of Commodity Prices.

Curve C — Money:
        Rate on 4-6 months good Commercial Paper
        Rate on 4-6 months prime Commercial Paper.

These new curves, although based on different data, have
a similar function to perform.

In addition to the above is the Forecasting Composite Line
prepared by the Brookmire Economic Service, designed to
forecast stock and commodity prices.


EXERCISES 

(1) What is the objective of an index number? State briefly
the relevant conditions for its construction, illustrating your answer
by reference to the Index Numbers of Prices published by the
Government of India.

(B. Com., Bombay, 1936).



268 


statistics: thkowy and PKArncK 


(2) IiT what respects are the Calcutta Index Numl>ers of 
Prices defective? What iinj)rovement would you sijg^cst to make 
them more representative? 

(B. Com,, Luck,, UfiiH), 

(3) Describe any index number in use in India at present for
measuring changes in the wholesale price level, and point out its
shortcomings.

(M.A., Cal., 1937).

(4) Explain the whole process of studying the changes in
the cost of living of cultivators in the C. P. during the next ten
years.

(B. Com., Alld., 1939).

(5) How will you construct a cost of living index number
of an Indian middle class family?

(M.A., Alld., 1937).

(6) How would you measure the cost of living in the United
Provinces for a series of years? What are the difficulties involved,
and how may they be solved?

(B. Com., Alld., ).

(7) Describe carefully how you would proceed to construct
the cost of living index numbers for the U.P. (for the benefit
of industrial labour). Would you allot weights according
to ‘Fisher's Ideal Method’ or Family Budget Method? Give
reasons in support of your answer.

(M. Com., Alld., ).

(8) Explain clearly how the “Capital” Index of Business
Activity in India is calculated. How far do you consider it
representative?

(B. Com., Alld., 1940).

(9) What statistical material would you utilize for preparing
an Index of Economic Activity in India? How would you collate
your data?

(M. Com., Luck., 1942).

(10) Name the important Wholesale price index numbers and
Cost of living index numbers published in India, England and the
U. S. A., and explain the construction of at least one of each type
in each of the three countries.

(11) What is the function of index numbers of business
conditions? Explain it with an illustration of an actual index number
of business conditions published in the U. S. A., or England.

(12) Write brief explanatory notes on the following:

(1) Sauerbeck's Index Number, (2) The Annalist Index
Number, (3) Board of Trade Index Number, (4)
The Statist Index Number, (5) The Economist
Index of Business Activity, (6) ‘Capital’ Index of
Indian Industrial Activity, and (7) Bombay Cost of
Living Index Number.

(13) How will you make an estimate of the ‘dearness
allowance’ that may be proposed to be given to industrial labour in
Cawnpore due to rise in the cost of living since the outbreak of
the present War?

(14) If you are required to study the changes in business
conditions in India, on which problems will you collect the
information from official and non-official sources?

(15) Point out the defects in the existing cost of living indices
in India and explain the scheme of the Government of India to
compile and maintain cost of living indices on a uniform basis.



CHAPTER XIV


DIAGRAMMATIC REPRESENTATION

An important function of the Science of Statistics is to
present complex and unwieldy data in a manner such that they
would be readily intelligible. Classification and tabulation
constitute the first step towards the attainment of this
objective; but even tables, containing as they do a number of
figures, do not enable one to grasp the whole data at a glance.
Computation of relative numbers, statistical averages and
index numbers constitutes a further step in the direction of
condensing the tabulated data. But still the condensed
material is presented in numerical form. Numbers are not
interesting to all. To many they are dull and confusing; and
if their number is pretty large, it would be difficult to compare
them and observe their differences. A long list of death
rates and birth rates, to take an example, relating to a large
number of towns in a country, or to different countries of the
world, would tire one's eye and confuse one's mind. It would
not be easy to note the differences in death rates and
birth rates of different towns, or countries as the case may
be. Therefore, it is necessary to adopt a device which may
present a huge mass of quantitative data, or their condensed
form, in a way that is at once comparable and appealing both
to the eye and the intellect. For this purpose, the method of
visual aids, which consists in presenting statistical material
in pictures, geometric figures and curves, has been devised.

Usefulness of Diagrams. 

Diagrams carry with them the merits of attraction and
effective impression. One may not like to devote even a
minute to the study of a page, even a small page, containing a
number of quantitative figures; and, even if he devotes time,
numerical figures may go out of his mind soon after he has
studied them. But the same person may not (in most cases,
would not) like to take his eyes away from a picture relating
even to the same topic to which the numerical data did. Nay,
he might invite others to have a look at the picture. And, if
the picture has really attracted him, it need not be said that
it would leave an effective impression on him. This is based
on human psychology, and a successful advertiser or
propagandist always exploits this psychology of the people to win
his mark. A manufacturer of soap bars advertised for a
considerable time that his bar, having the same price as that of
his competitor, was much heavier than his competitor's, but
did not find any improvement in his sales. And, when he
advertised in pictorial form (a balance containing his bar on
one side and his competitor's on the other, the pan containing
his touching the ground, while the other much above the
ground, and the words "For the same price" beneath the
picture) he found to his pleasant surprise that the demand
for his bars increased so much that he had to extend his
plant.

It follows from the above example that diagrams are not
only attractive and impressive, but also have the merit of
rendering the whole idea readily intelligible. A man, to take
another example, who has never seen or handled more than
a few hundreds of anything may not understand how large
the city of Bombay would be if he is told that its population
in 1941 is 1,490,000. But, if he is living in Nagpur and is
told that Bombay is nearly five times the size of his own city
in population, he would, no doubt, try to understand what
it means. The idea, however, shall be more easily and readily
grasped if this fact is represented to him diagrammatically:
e.g., an area may be divided into five equal parts, one of
which may be shaded and named Nagpur, while the whole of
the area would show the population of Bombay. It would be
clear at a glance that Nagpur is only one-fifth of Bombay.

Another merit of diagrams is the ease with which they
make comparison possible. The population of Nagpur with
Bombay's, or the weight of one bar with that of the other, in
the above examples, can be quite easily and readily compared.

Yet another characteristic feature of diagrams is that
they save much valuable time, which would otherwise be lost
in grasping the significance of numerical data.

Lastly, a chief merit of diagrams and graphs is that the
entire data, which expressed in numerical form may be
unwieldy and require a number of pages to write down, are
made visible at a glance.

For these merits of theirs, diagrams are very useful in
economic and social studies. A purely theoretical economist
finds in them the basis for logical reasoning and for easily
explaining an economic law, such as the law of substitution or of
diminishing utility. A practical economist may make his ideas
impressive through diagrammatic representation. Knowing
that the expenditure of the eleven provincial governments
in India in 1940-41 on Industries totalled 115 lakhs of rupees
and that on Police 1,120 lakhs of rupees, he would do well to
represent his idea diagrammatically rather than quoting the
figures. When a social reformer is addressing an audience,
mere reading out of figures would make the hearing dull,
tedious and tiring. But, if he appears on the platform with
pictures, diagrams and graphs, his talk would be interesting,
lively and impressive. A businessman, or an administrator,
has hardly any time to devote to the study of a huge mass of
figures, however well arranged. But, if he is presented with
graphs showing the rise and fall of a certain activity, or with
pictures and diagrams, it will hardly take him more than a few
minutes to grasp the significance of the whole. It is, thus,
evident that diagrams, charts, pictures, graphs and similar
other visual aids serve a more useful purpose than any other
device.

But diagrams can be as much misused as they are useful.
In advertisements and political propaganda they are often
deliberately misleading, though literally correct. The true
statistician has to guard himself against misrepresentation.
Hence some general directions for drawing diagrams.

Directions for Drawing Diagrams.

It should be remembered that diagrams do not add
anything to the meaning of statistics. They afford only a method
of presentation. However, when drawn and studied
intelligently they bring to light the features of statistical groups
and series; they show the various components of a group in
relation to each other and to the group as a whole; they show
the unity that underlies the scattered figures. They are,
therefore, only a means to an end, the end being to make
comparisons. Consequently, if there is only one isolated
numerical quantity, there is no sense in presenting it
diagrammatically. Similarly, if there are many figures, in no way
related to one another and, therefore, having no common
characteristic, they are incomparable and, therefore, need not
be diagrammatically presented. For example, if we know
that the monthly expenditure of a certain student is Rs. 50,
his age is 22 years, the length of his nose is 1.325 inches and
he has 20 books, we cannot represent Rs. 50, 22 years, 1.325
inches and 20 books by any kind of diagram, since the four
numbers are incomparable. On the other hand, if we know
that one student is 60 inches tall and another 48 inches,
we can compare the two and, therefore, represent them
diagrammatically. It is, then, established that the method of
diagrammatic representation can be made use of when there are at
least two numbers which are similar in nature and character
at least in one important respect and also vary independently
of each other.


Another point that should be kept in view is that diagrams
are not substitutes for the real magnitude of the quantity
they represent. The size of a diagram changes with the
change in the scale to which it is drawn. The same quantity
drawn to two different scales will yield diagrams of different
sizes.

In the technique of diagram drawing, it is evident from
the above, the selection of the proper scale occupies an
important place. No rigid rules can be laid down for the
selection of a proper scale, but a general direction is that the
most suitable scale is the one which yields a diagram
neither so big as not to be visible at a glance nor so small
as to look clumsy and indistinct and cover only a very small
part of the space available. The scale should be so chosen
that the size of the resulting diagram would show the
significant features of the numerical quantities for which it
stands. All principal details must be clear. The vertical
scale should be marked at equal unit spaces and the
measurement of each unit space put down. Generally, the
vertical scale should be shown on the left-hand side of the
diagram. The horizontal scale should be given at the bottom
of the diagram. On each side, the vertical and the
horizontal, the thing represented should be indicated. For
instance, the vertical line might show the amount in rupees
and the horizontal the different countries. The diagram
should be neatly drawn with the help of drawing
instruments. It should be given a suitable heading. The
data represented diagrammatically should be given on a page
adjacent to the one on which the diagram is drawn. If these
data are indicated in the diagram itself, care should be taken
to see that the quantities are so placed in the diagram that
they do not distort the visual impression conveyed by the
diagram. To make distinctions clear, various kinds of
dotting, lining, crossing, cross-hatching, or colouring should
be used.




The drawing of diagrams, as a matter of fact, is not so
difficult as the selection of suitable types of diagrammatic
forms to depict a concise picture of the statistical data in
hand. In selecting the most suitable diagram from among
the varied forms of diagrams, the criterion should be that the
diagram selected should lead most quickly and with the
greatest accuracy to the real meaning of the quantitative data.
The test of a successful selection lies in the speed with which
the quantities can be accurately studied with the help of the
diagram selected.

Different Forms of Diagram. 

Diagrammatic representation can be made in any one of
the following ways:

(1) One dimensional diagrams, e.g., lines or bars drawn
to a common scale.

(2) Two dimensional diagrams, e.g., squares and
rectangles whose areas are made proportional to
the given figures.

(3) Circular and angular diagrams, e.g., circles whose
areas are made proportional to given magnitudes,
and which may be divided into sectors whose
common unit is the degree.

(4) Three dimensional diagrams, e.g., cubes, cylinders,
blocks whose volumes are made proportional to
the given figures.

(5) Pictograms, e.g., statistical maps and pictures.

Technically there is no objection to using squares,
rectangles, cubes, circles and pictures; but in practice, lines, bars
and angular diagrams are the easiest to draw. They can also
be made sufficiently accurate. Therefore, so far as possible
they should be preferred. The terms diagram, chart, and
graph are very often used without distinction, the same figure
being given any of these names. We shall use the term
diagram for the various forms pointed out above, and the term
graph for curves only, which will be dealt with in later
chapters.

One Dimensional Diagrams — Simple Bar.

A bar is merely a thick line whose width, though shown
in the diagram, is not taken into consideration in representing
the magnitude. It is shown merely to make the diagram look
attractive. Therefore, those diagrams in which only one
dimension is considered are called one dimensional diagrams.
Now, a bar may be shown as a simple bar or it may be
divided into parts. Those one dimensional diagrams in which
the bar is not sub-divided are called Simple bar diagrams.
We consider them first.

To draw a bar diagram the height of the biggest bar
should be adjusted to the size of the diagram. Some margin
should be left all round the diagram to write down the title
and the designation of units and scale. The width of the
bars should be neither too big nor too small. If the number
of items is very large and the space is very limited, thick
bars may be replaced by thin lines as done in figure 15. The
bars should be drawn to a common horizontal or vertical base
line; generally the horizontal base is used on which bars are
made to stand vertically. Horizontal base is used for the
simple reason that comparison of one bar with another can
better be made in terms of height. In a single study all the
bars must be of the same uniform width, separated by equal
intervening spaces. Bars may be coloured, lined or dotted,
but the colour used or lining or dotting done should be the
same in all the bars in a single study. If bars are made to
touch each other, that is, no intervening blank spaces are left,
the diagram would look a continuous and blurred one with its
top disfigured. Bar diagrams are not suitable for presenting
continuous series such as that spread over a period of time.
For this purpose graphical methods of presentation are used.
Bar diagrams are suitable for representing discrete series.

Table 34 gives the yield of certain food crops in India
for 1936-37,* which are diagrammatically represented in
figure 3.

Table 34. Yield of certain food crops in British India
(including minor States), 1936-37.

Crop                 Wheat   Sugar-cane   Jowar     Gram    Barley   Maize
Yield (1,000 tons)   8,513   [?],289      [?],401   3,817   2,311    1,836

([?] marks a leading digit that is illegible in this copy.)

The highest yield to be represented is 8,513,000 tons.
A suitable scale has been selected to represent this yield
properly, and the yield of other crops has been reduced to
this scale. The lower ends of all the bars have been placed
on the common horizontal base, so that comparison can be
made between yields of different crops by comparing their
heights. To make comparison easy, the bars have been arranged
in descending order; they could have been arranged in
ascending order as well. The scale is put down on the left-hand
side, a little away from the biggest bar. Names of the crops
are given below each bar. A suitable heading is given at
the top. The bars are separated by equal blank interspaces.

* Thomas and Sastry, Indian Agricultural Statistics, page 119.


[Fig. 3. Yield of certain food crops in British India (including
minor States) in 1936-37. Vertical scale: yield in tons (000,000);
bars for Wheat, Sugar-cane, Jowar, Gram, Barley and Maize.]
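The reduction of yields to one common scale, as described for figure 3, may be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `bar_heights` and the maximum height of 4.5 inches are hypothetical and not taken from the text, and only the highest yield quoted in the text and the three clearly legible entries of Table 34 are used.

```python
# Illustrative sketch: reduce magnitudes to bar heights on one common scale.
# The function name and the 4.5-inch maximum height are assumptions.

def bar_heights(yields, max_height_inches):
    """Return (name, height) pairs arranged in descending order."""
    largest = max(yields.values())
    units_per_inch = largest / max_height_inches   # the common scale
    ordered = sorted(yields.items(), key=lambda item: item[1], reverse=True)
    return [(name, value / units_per_inch) for name, value in ordered]

# Yields in thousands of tons, 1936-37 (legible entries only).
yields = {"Maize": 1836, "Wheat": 8513, "Barley": 2311, "Gram": 3817}

heights = bar_heights(yields, max_height_inches=4.5)
for name, height in heights:
    print(f"{name:8}{height:5.2f} in.")
```

The biggest bar takes the whole of the available height, and every other bar is reduced in the same ratio, so that heights alone carry the comparison.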


In interpreting figure 3, it should not be said that the
yield of wheat was the largest in India in 1936-37, for there
might be some crop other than the crops shown in the diagram
whose yield may have been higher. Actually, the yield of
rice in the same year was 27,143,000 tons. It can, of course,
be said without any mistake that among the six crops
represented in the diagram the yield of wheat was the highest
and that of maize the lowest in 1936-37.

In figure 15, heights of 55 boys have been shown by verti- 
cal lines, the method of construction being nearly the same 
as that of figure 3. 

If instead of yield of food crops in India, we were given
the imports, death rates, or income per capita of different
countries of the world, a similar method would have been
followed, the base line showing the countries and the vertical
line the quantities relating to them. Again, if we were
given imports and exports of different countries and it
was desired to compare imports of different countries,
exports of different countries and imports and exports
of the same country, we could have extended the
principle of simple bar diagram for this purpose as well.
Choosing the horizontal or the vertical line as the base we
could have placed two bars, one representing imports and the
other exports of the same country, and separated this pair
by an intervening blank space from another similar pair for
another country, and so on, colouring the bars showing
imports with one colour and those showing exports with another
colour. This method is somewhat complex for simple
comparisons, and can be replaced by that of sub-divided single
bars.

One Dimensional Diagrams — Sub-divided Bar.

If a given magnitude can be broken up into the parts of
which it is composed, or if there are independent quantities
constituting the sub-divisions of a total, bars sub-divided in
the ratio of the different components may be used to show
the relationship of the parts to the whole. For instance, if death
rates and birth rates of different countries are given, bars
may first be drawn to represent the birth rates, and from these
bars portions from the bottom may be cut off
in proportion to the death rates and coloured black to
distinguish them from the remaining white portion, which would show
the survival rate. Again, if the population of a number of
countries is given, bars representing the population of
different countries may be further sub-divided into two parts
in proportion to males and females, one portion being shaded
black. Technically, in these two examples we would say
that the bar showing death rates has been super-imposed on
that showing birth rates, or bars showing males and females
have been super-imposed on the bar showing the total
population.

Table 35 gives the value of exports and imports of India
in total merchandise for three years. These exports and
imports may be added up, and bars proportionate to the totals
may be drawn, the three bars being of unequal height because
of the inequality of the total sea-borne trade. These bars
may now be sub-divided into two portions, the lower one in
each one of the three bars showing the exports and painted
black and the upper portion, remaining white, showing the
imports. Thus three comparisons will be possible at a glance,
viz., those relating to exports, imports and total trade in
different years.

Another method may be to draw three bars of equal
length to show the total foreign trade, which in each case
may be put down at hundred; these bars should then be
sub-divided according to the percentage which exports and imports
in each year bear to the total foreign trade of that year.

Let us suppose that the three bars are 5 inches in length
each. The total foreign trade for 1923-24 is about Rs. 600
crores, and the exports and imports are respectively Rs. 363
crores and Rs. 237 crores, so that they are respectively about
60 per cent. and 40 per cent. of the total foreign trade. The bar
of 5 inches for 1923-24 would, therefore, be divided in the
proportion of 3:2 to represent 60 per cent. and 40 per cent.
respectively. This single bar now represents percentage values of
the imports and exports to total foreign trade of India separately.
The difference between this method and the former method
should be carefully noted. The former method makes possible
the comparison of actual values of imports with exports and
of imports or exports with the total foreign trade; while the
latter method makes possible the same comparisons in
percentage values.
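The division of a bar of fixed length according to percentage shares may be sketched as follows. This is only an illustrative sketch: the function name `percentage_segments` is an assumption, and the figures for 1923-24 are entered in crores of rupees as they stand in Table 35.

```python
# Illustrative sketch: sub-divide a bar of fixed length by percentage shares.
# The function name is an assumption; the 5-inch length follows the example.

def percentage_segments(parts, bar_length):
    """Map each part to (share in per cent, length of its segment)."""
    total = sum(parts.values())
    return {name: (100 * value / total, bar_length * value / total)
            for name, value in parts.items()}

trade_1923_24 = {"Exports": 363.37, "Imports": 237.18}   # crores of Rs.

segments = percentage_segments(trade_1923_24, bar_length=5.0)
for name, (percent, length) in segments.items():
    print(f"{name}: {percent:.1f} per cent., {length:.2f} in.")
```

With the unrounded figures the shares work out to about 60.5 and 39.5 per cent., which the text rounds to 60 and 40, giving the 3:2 division of the 5-inch bar.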

If, however, the aim is to show the balance of trade, the
method of sub-divided bar diagram can be applied by
sub-dividing the bar for exports or for imports, whichever is
greater, into two portions, the one representing the imports or
exports, whichever is less, and the other showing the balance
of trade, positive or negative, as the case may be.

Table 35. Value of Sea-borne Trade of India in Total
merchandise (including Govt. Stores).

Year        Imports         Exports         Balance of Trade
            Crores of Rs.   Crores of Rs.   Crores of Rs.

1921-22     282.59          248.65          - 33.94
1922-23     246.19          316.07          + 69.88
1923-24     237.18          363.37          +126.19

To represent the figures of exports, imports and balance
of trade given in table 35 a common horizontal base line is
chosen in figure 4. For 1922-23 and 1923-24 two bars with
their heights in proportion to the respective amounts of exports
are drawn; then, from the bottom of the two bars heights in
proportion to imports are cut off and painted black. The
heights of the full bars represent the values of exports, of the
coloured portions those of imports, and of the remaining white
portions those of the favourable balance of trade indicated by
the plus sign in column 4 of table 35. For the year 1921-22 the
imports are greater than the exports. Therefore, first a bar
representing the value of imports for the year is drawn on the
same vertical scale. From the bottom of this bar a height
equalling exports is cut off and left blank, the small portion at
the top being coloured black. The height of the full bar indicates
total imports, the white portion in it represents exports and the
coloured portion at the top shows the unfavourable balance of
trade. Were, however, the exports in a particular year
exactly equal to the imports, the bar would be painted black,
without the necessity of being sub-divided. The height of the
bar would indicate total imports as well as total exports,
meaning that the balance of trade was nil. Figure 4 makes
possible the comparison of actual values. Comparison of
percentage values can also be made by drawing the bars on
percentage basis in the manner indicated above.

[Fig. 4. Value of sea-borne trade of India in total merchandise
(including Government stores).]
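The sub-division used in figure 4 may be sketched as below. The sketch is illustrative only: the function name `balance_bar` and the dictionary keys are assumptions, and the import figure for 1922-23 is taken as 246.19, which is what the printed exports and balance for that year imply.

```python
# Illustrative sketch of the sub-division used to show the balance of trade.
# The function name and the dictionary keys are assumptions.

def balance_bar(imports, exports):
    """Split the taller bar into the smaller quantity and the balance."""
    balance = round(exports - imports, 2)     # plus: favourable, minus: unfavourable
    return {
        "full_bar": max(imports, exports),    # height of the whole bar
        "lower_part": min(imports, exports),  # cut off from the bottom
        "upper_part": abs(balance),           # what remains at the top
        "balance": balance,
    }

# Crores of rupees, as read from Table 35: (imports, exports).
trade = {
    "1921-22": (282.59, 248.65),
    "1922-23": (246.19, 316.07),
    "1923-24": (237.18, 363.37),
}

bars = {year: balance_bar(i, e) for year, (i, e) in trade.items()}
for year, bar in bars.items():
    print(year, bar)
```

In each year the lower and upper parts together make up the full bar, so a single bar shows exports, imports and the balance at once.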


A single bar may be sub-divided into more than two
sub-divisions as well. Table 36 gives the proceeds, cost, and
profit and loss per table during three years. In the year 1938
there is a gain of Re. 1, in 1939 there is neither gain nor loss,
while in 1940 there is a loss of Re. 1. It is desired to represent
the given data by a sub-divided bar diagram using the
percentages.

Table 36. Proceeds, Cost, Profit or loss, per table during
1938, 1939, and 1940.

                              1938           1939           1940
Particulars                 Rs.     %      Rs.     %      Rs.     %

Proceeds per table          10     100     15     100     20     100
Cost per table:
  Wages                     4.5     45     7.5     50     10.5   52.5
  Other Costs               3.0     30     5.1     34     7.0    35.0
  Polishing                 1.5     15     2.4     16     3.5    17.5
  Total Cost                9.0     90     15     100     21    105.0
Profit (+) or loss (-)
  per table                 +1     +10     ..      ..     -1     -5

In table 36 the percentages of wages, other costs and
polishing to proceeds per table are shown for the different
years. In 1938 the profit is 10% of the proceeds and in 1940
the loss is only 5% of the proceeds, although the actual
profit and loss in both the cases are the same, viz., one
rupee. Sub-divided bar diagram drawn on the percentage
basis would, therefore, be better in so far as comparisons of
relative values are concerned; but it would not be suitable if
actual values are to be compared. Figure 5 shows the diagram
on percentage basis.


[Fig. 5. Percentages of cost of, and profit & loss on, a table
in 1938, 1939 and 1940.]

The construction of the diagram to represent the data in
table 36 should be carefully studied. First, three bars to
represent the proceeds per table are drawn and made equal
to 100. The proportions of wages, other costs and polishing
are then cut down from the bars such that the same order
for each of them is maintained in the bars. The surplus in
the bar for 1938, just above the horizontal base line, indicates
profits; there is no surplus in the bar for 1939. In the bar
for 1940 the deficit of 5% has been made good by extending
the bar below the horizontal line in minus direction, the
portion below the horizontal line showing the loss.
Comparison in terms of percentages is very easy from the figure.
Bars could also be similarly drawn to compare actual values.
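The cutting down of a percentage bar, with a loss carried below the base line, may be sketched as follows. The sketch is illustrative only: the function name `percentage_bar`, the segment labels and the order in which the costs are cut off (wages first, from the top) are assumptions; the figures are those of Table 36.

```python
# Illustrative sketch: cut the costs down from the 100 mark of a percentage bar.
# Function name, labels and the order of the cuts are assumptions.

def percentage_bar(proceeds, costs):
    """Return (label, lower, upper) segments in per cent. of proceeds."""
    segments = []
    upper = 100.0
    for name, value in costs.items():
        lower = round(upper - 100.0 * value / proceeds, 4)
        segments.append((name, lower, upper))
        upper = lower
    if upper != 0:
        # A surplus stands just above the base line; a deficit runs below it.
        label = "Profit" if upper > 0 else "Loss (below base line)"
        segments.append((label, min(upper, 0.0), max(upper, 0.0)))
    return segments

table_36 = {   # year: (proceeds per table, costs per table), in rupees
    "1938": (10, {"Wages": 4.5, "Other costs": 3.0, "Polishing": 1.5}),
    "1939": (15, {"Wages": 7.5, "Other costs": 5.1, "Polishing": 2.4}),
    "1940": (20, {"Wages": 10.5, "Other costs": 7.0, "Polishing": 3.5}),
}

bars = {year: percentage_bar(p, c) for year, (p, c) in table_36.items()}
for year, segments in bars.items():
    print(year, segments)
```

For 1940 the costs add up to 105 per cent., so the last cut ends 5 units below the base line, and that portion is the loss.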

Although bars may be used for showing the sub-divisions 
of a large number of totals, it is not advisable to adopt the 
bar method for comparison if there are more than three or four 
components of each total, because in that case even if the 
same order is followed in the sub-divisions in each bar, the 
disparity among the figures may place them wide apart so 
that one type of component would not be opposite the other 
in all cases, and therefore it may not be possible to make 
comparisons at a glance. 


Two Dimensional Diagrams — Rectangles.

The breadth of the bars, though shown in the diagrams,
was so far left out of consideration. It would be utilised now
in drawing rectangular diagrams. A rectangle has two
dimensions. Hence, its area, and not the height alone as in the
case of one dimensional diagram, is taken to represent a
magnitude. If several magnitudes are given, they may be
represented by separate rectangles whose bases are equal but
heights proportional to the given magnitudes so that their
areas would stand in the same ratio as the given magnitudes.
Rectangles are suitable for use in cases where two or more
quantities are to be compared and each quantity is sub-divided
into several components.

Table 37 gives the monthly incomes of two families and
their expenses on different items. The data would be properly
represented by rectangular diagram.

Table 37. Family budgets of two families.

                        Family A, Income Rs. 80         Family B, Income Rs. 40
Items of              Actual    Percent-  Cumulative   Actual    Percent-  Cumulative
expenditure           expenses  age       percentage   expenses  age       percentage
                      Rs.                              Rs.

1. Food                 32        40         40          20        50         50
2. Clothing             20        25         65           8        20         70
3. Shelter               8        10         75           4        10         80
4. Fuel and light        4         5         80           2         5         85
5. Miscellaneous        16        20        100           6        15        100

   Total                80       100         ..          40       100         ..

[Fig. 6. Percentage of income spent by two families on
items of expenditure.]

The incomes of the families are Rs. 80 and Rs. 40. Therefore,
in figure 6, on the same horizontal line two rectangles are
erected such that the widths of the two are in the ratio of
80:40, or 2:1, and the heights are equal. Thus the areas of the
two rectangles are in the same proportion as the incomes. Each
of the rectangles is, then, sub-divided into components according
to the percentage of income spent on different items of
expenditure. Percentages are shown in table 37. In order that
the marking out of sub-divisions may be easy and convenient,
cumulative percentages have been calculated. They are shown in
the table in columns 4 and 7. The vertical scale is given on the
left, and with its help cumulative percentages have been
marked in the rectangles. Thus, for family A the first
mark is put at 40, the second at 65, the third at 75, the fourth
at 80, so that the remaining 20 is the percentage expenditure
on miscellaneous items. Cumulation of percentages reduces
the chances of error in marking the sub-divisions to a minimum.
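The cumulative percentages at which the marks are put may be worked out as in the following sketch. It is illustrative only: the function name `cumulative_percentages` is an assumption, and the expenses are those given for the two families in Table 37.

```python
# Illustrative sketch: cumulative percentages for marking the sub-divisions.
# The function name is an assumption; expenses are in rupees (Table 37).
from itertools import accumulate

def cumulative_percentages(expenses):
    """Return the heights, in per cent., at which successive marks fall."""
    total = sum(expenses.values())
    percentages = [100 * value / total for value in expenses.values()]
    return list(accumulate(percentages))

family_a = {"Food": 32, "Clothing": 20, "Shelter": 8,
            "Fuel and light": 4, "Miscellaneous": 16}
family_b = {"Food": 20, "Clothing": 8, "Shelter": 4,
            "Fuel and light": 2, "Miscellaneous": 6}

marks_a = cumulative_percentages(family_a)
marks_b = cumulative_percentages(family_b)
print(marks_a)
print(marks_b)
```

Each mark is measured from the base line, so an error made in placing one mark is not carried forward to the marks above it.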

The areas of the components of the rectangles are
proportional to the actual expenses on various items, and the areas
of the rectangles are in proportion to the income. Therefore,
comparison of percentages of income spent over different
items in the same family and in the two families is rendered
easy. A glance at the heights of the rectangles relating to
percentage expenditure on shelter and fuel and light would
show that these heights are identical in the two. It implies
that the percentage of income spent on these items in the
two families is equal but the actual expenditure is different.
The actual expenses stand in the same ratio as the areas of
the components. Similarly, there might be a case in which
the heights of the components may be different but their areas
equal, implying that the actual expenses are identical but
percentage expenditure is not, the percentage expenditure
being in the same ratio as the heights of the components.
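That equal heights with unequal widths give areas in the ratio of the actual expenses may be checked with a short sketch. It is illustrative only: the function name `component_areas` and the choice of one unit of width for every Rs. 40 of income are assumptions; the expenses on shelter and on fuel and light are those of Table 37.

```python
# Illustrative sketch: area of each component = width x height.
# Assumption: one unit of width for every Rs. 40 of income; heights are
# the percentages of income, so every rectangle is 100 units high.

def component_areas(income, expenses, rupees_per_unit_width=40):
    width = income / rupees_per_unit_width
    return {item: width * (100 * spent / income)
            for item, spent in expenses.items()}

shelter_and_fuel_a = {"Shelter": 8, "Fuel and light": 4}   # Family A, Rs.
shelter_and_fuel_b = {"Shelter": 4, "Fuel and light": 2}   # Family B, Rs.

areas_a = component_areas(80, shelter_and_fuel_a)
areas_b = component_areas(40, shelter_and_fuel_b)
for item in areas_a:
    print(item, areas_a[item], areas_b[item], areas_a[item] / areas_b[item])
```

The heights for each item are the same in the two families, yet the areas stand as 2:1, which is the ratio of the rupees actually spent.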

Rectangles are not used to represent only the family
budgets. They are two dimensional diagrams and can,
therefore, be also employed to show three different factors.
Consider table 38. It gives (1) price of a commodity, (2)
its quantity sold and (3) different expenses of production and
net profit. These facts have been diagrammatically
represented in figure 7, a close study of which would reveal
that the heights of the two rectangles are in proportion to
the quantities sold, and the widths are in proportion to the
price per unit, so that the areas of the rectangles are in
proportion to the total proceeds in each case. These areas are
sub-divided into their components, viz., expenses on raw materials
and on production, and profit.

Table 38. Details of price, cost and quantity sold of two
commodities.

                                 I                  II

Price of a commodity             Rs. 2 per unit     Rs. 3 per unit
Quantity sold                    40                 20
Value of raw-materials           Rs. 26             Rs. 24
Other expenses on production     Rs. 32             Rs. 21
Profits                          Rs. 22             Rs. 15
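How one rectangle carries price, quantity and the break-up of the proceeds may be sketched as below. The sketch is illustrative only: the function name `sub_divided_rectangle` is an assumption, and so is the choice of cutting the rectangle by horizontal lines, so that each component takes a share of the height; the figures are those of Table 38.

```python
# Illustrative sketch: width = price per unit, height = quantity sold,
# area = total proceeds. Horizontal sub-division is an assumption.

def sub_divided_rectangle(price, quantity, components):
    """Return total proceeds and the height taken by each component."""
    proceeds = price * quantity                  # area of the whole rectangle
    slices = {name: value / price for name, value in components.items()}
    return proceeds, slices

proceeds_1, slices_1 = sub_divided_rectangle(
    2, 40, {"Raw materials": 26, "Other expenses": 32, "Profit": 22})
proceeds_2, slices_2 = sub_divided_rectangle(
    3, 20, {"Raw materials": 24, "Other expenses": 21, "Profit": 15})

print(proceeds_1, slices_1)
print(proceeds_2, slices_2)
```

In each case the heights of the slices add up to the quantity sold, which confirms that the three components exhaust the proceeds of Rs. 80 and Rs. 60.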

[Fig. 7. Details of cost of two commodities. Vertical scale:
quantity sold.]







It should be noted that rectangles are the only two
dimensional diagrams which are capable of showing three
different factors. Squares and circles, which are also surface
diagrams, do not enjoy this property. Hence, the place of
rectangles in diagrammatic representation is very important.

Two Dimensional Diagrams — Squares.

When it is desired to compare quantities which bear the
ratio of 1:100, bar diagrams fail to answer the purpose, since
howsoever small the scale selected be, the height of one bar
will have to be made 100 times as great as that of the other
with the result that one bar will be too tall and the other too
small. In such cases bars are replaced by squares.

The side of a square varies as the square-root of its area.
If, therefore, two figures 100 and 10,000 are to be represented
by squares their sides would be taken in the proportion of
√100 : √10,000 or 1:10 and not in that of 100:10,000. The
awkwardness of sizes of diagrams would thus be largely
done away with.

Table 39 gives the production of cane-sugar in four
countries for 1938-39, arranged in descending order. Column
2 of the same table gives the approximate square roots of
figures given in column 1. Column 3 contains numbers which
are obtained by dividing the square-roots by 50. These
numbers give the sides of the squares. They can be taken
to be so many inches, cms. or any other convenient measure
of length. Inches are generally preferred. The four squares
in figure 8 are made to stand on the same horizontal line with
equal intervening spaces between them. The squares show
the proportionate differences, though it needs a practised eye
to detect them.



Table 39. Production of Cane-sugar for 1938-39.

Countries              Quintals           Square-roots   Sides of the
                       (0000's omitted)                  squares in inches
                       1                  2              3

1. India               2750               52.44          1.05
2. Neth. Ind.: Java    1550               39.37          0.79
3. Hawaii               835               28.90          0.58
4. Columbia              51                7.14          0.14
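The working of columns 2 and 3 of Table 39 may be sketched as follows. The sketch is illustrative only: the function name `square_sides` is an assumption; the divisor of 50 and the production figures are those given in the text and the table.

```python
# Illustrative sketch: side of each square = square-root of the figure / 50.
# The function name is an assumption.
import math

def square_sides(quantities, divisor=50):
    """Return the side of each square, rounded to two places."""
    return {name: round(math.sqrt(q) / divisor, 2)
            for name, q in quantities.items()}

# Production in quintals, 0000's omitted (column 1 of Table 39).
production = {"India": 2750, "Neth. Ind.: Java": 1550,
              "Hawaii": 835, "Columbia": 51}

sides = square_sides(production)
for name, side in sides.items():
    print(f"{name:18}{side:5.2f} in.")
```

Although the largest figure is more than fifty times the smallest, the largest side is only about seven and a half times the smallest, which is what makes the squares manageable on one page.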


It will be seen that scale is given above the squares in
figure 8. Let us see how this scale is calculated. The side
of the first square, relating to India, is 1.05 inches, whose
square is 1.1025 square inches. This, therefore, is the area
of the square. It represents 2,75,00,000 quintals. Therefore
1 sq. inch represents 2,49,40,000 quintals nearly. This is the
scale to which squares have been drawn in figure 8.
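The arithmetic behind Table 39 and the scale of figure 8 may be retraced with a short Python sketch. It is offered only as an illustration: the names `production`, `DIVISOR`, `square_side` and `scale` are assumptions of this sketch and do not occur in the text.

```python
import math

# Production figures from Table 39, in units of 10,000 quintals.
production = {
    "India": 2750,
    "Neth. Ind.: Java": 1550,
    "Hawaii": 835,
    "Columbia": 51,
}

DIVISOR = 50  # the reduction factor used in the text


def square_side(value, divisor=DIVISOR):
    """Side of the square in inches: square root of the value, scaled down."""
    return math.sqrt(value) / divisor


# Sides rounded to two decimal places, as in column 3 of the table.
sides = {country: round(square_side(v), 2) for country, v in production.items()}

# Scale of the diagram: quintals represented by one square inch,
# worked out from the India square as in the text.
india_quintals = production["India"] * 10_000
scale = india_quintals / sides["India"] ** 2

print(sides)
print(f"1 sq. inch represents about {scale:,.0f} quintals")
```

Any other square would serve equally well for finding the scale, apart from small differences due to rounding of the sides.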

Cane-sugar production in certain countries.
1 sq. in. = 2,49,40,000 quintals.

Fig. 8


In figure 8 squares are shown separately one from the
other. Separate squares do not afford a proper view of
proportions at a glance, and also require much space. If a total is
capable of being divided into parts, the total may be shown
by a square and its components sub-divided in it as they are
sub-divided in rectangles. This method would not require
much space, and would, at the same time, render comparison
of proportions easy. For instance, given the area of the
world and of the several continents, the area of the world may
be represented by a square and the areas of the several
continents by its rectangular sub-divisions. The square may be
sub-divided horizontally or vertically.

Squares require much time and labour to be drawn
accurately. They are therefore replaced by circular diagrams.
Circles take less time to be drawn, and can be drawn
sufficiently accurately.

Circular Diagrams — Circles. 

The area of a circle varies as the square of its radius.
If the radius of a circle is twice that of another, its area
would be four times the area of the other. Similar is the
case with squares. If the side of a square is twice that of
another, its area would be four times the area of the other.
It follows that if the radii of several circles are in the same
proportion as the sides of the squares, the areas of the circles
would also be in the same proportion as the areas of the
squares. Therefore, the same numbers may be used either
as the radii of several circles or as the sides of several squares,
without resulting in any difference in their comparative
study. Thus the squares shown in figure 8 may be replaced
by an equal number of circles whose radii would be the same
numbers as represent the sides of the squares. Besides,
circles are more attractive to look at and, in many cases,
more effective to compare than squares. The ease with which
they can be drawn is another advantage of circles. Therefore,
when there is a choice between squares and circles, the latter
are generally preferred.
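That the same numbers serve equally as sides of squares or as radii of circles can be checked numerically. The following Python sketch is an illustration only; the two lengths are the sides for India and Java from Table 39, and the function names are assumptions of the sketch.

```python
import math


def square_area(side):
    """Area of a square of the given side."""
    return side ** 2


def circle_area(radius):
    """Area of a circle of the given radius."""
    return math.pi * radius ** 2


# The same two numbers, used once as sides and once as radii.
a, b = 1.05, 0.79

ratio_squares = square_area(a) / square_area(b)
ratio_circles = circle_area(a) / circle_area(b)

# The factor pi cancels, so the two ratios agree.
print(round(ratio_squares, 2), round(ratio_circles, 2))
```

Since π appears in both numerator and denominator, it drops out of the comparison, which is why the choice between squares and circles does not affect the proportions shown.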

Table 40 gives the production of ginned cotton in five
countries, for 1937-38, in column (2).

Table 40. Production of Ginned Cotton, 1937-38.

Country      Production in Quintals   Square roots   Radii in inches
             (0000's omitted)
(1)          (2)                      (3)            (4)

India        1048                     32.3           1.53
China        700                      26.4           1.27
Brazil       460                      21.5           1.02
Peru         81.5                     9.0            .43
Argentine    51.4                     7.2            .34

The square-roots of figures in column (2) are given in column (3).
These square-roots may be taken as the radii of the circles, but since
they are too big to be shown in inches or cms., they are
divided by 21. The numbers thus obtained are shown in
column (4). They stand for radii in inches, and have been
used in drawing the circles in figure 9. The circles are
arranged in descending order. The proportions of production
of ginned cotton in different countries are comparable at a
glance. In interpreting the circles in figure 9 we should not
say that production of ginned cotton is the least in Argentine in
the world, since there may be several other countries, not
shown in the diagram, which may have still less production.
All that can be said is that among the five given countries,
production in Argentine is the least.

Production of ginned cotton in certain countries.

Fig. 9


Angular Diagrams — Sectors. 

Just as a rectangle or a square can be sub-divided into
its components, a circle can also be divided into sectors to
represent the parts of a total. The method of circles and
sectors is to be preferred to that of sub-dividing a square,
because of the suggestive and attractive character of sectors
and circles, and also because of the ease with which they can
be drawn.




Table 41 gives the area under food-crops cultivated in 
1939-40 in the United Provinces and the Punjab. 

Table 41. Area under food-crops in the U.P. and the Punjab, 1939-40.

                            U.P.                                       Punjab
Food crop                   Area in acres  Area in acres  Angles of    Area in acres  Area in acres  Angles of
                            (000's         (000,000's     the sectors  (000's         (000,000's     the sectors
                            omitted)       omitted)                    omitted)       omitted)

Rice                        7764           8              76°          977            1              18°
Wheat                       8109           8              76°          9566           10             178°
Barley                      3823           4              38°          730            0.7            13°
Jowar                       2307           2              19°          778            0.8            14°
Bajra                       2388           2              19°          3061           3              53°
Maize                       2107           2              19°          1143           1.1            20°
Gram                        5399           5              47°          2413           2.4            43°
Other food grains & pulses  6819           7              66°          1231           1.2            21°

Total                                      38             360°                        20.2           360°


Figure 10 represents the data given in table 41. Two circles
are drawn, one for area under total food-crops in the U.P.
and the other for area under the same crops in the Punjab,
such that the areas of the two circles are in proportion to the
areas under food-crops in the two provinces. Their radii are
in the ratio of √38 : √20.2. To divide these circles into
sectors, the principle to note is that the areas of the sectors
should be proportional to the areas under different crops.
Now, the areas of the sectors are proportional to the angles
at the centre. Therefore, 360°, the total number of degrees
contained in a circle, are to be divided into proportional parts
to get sectors of the required areas. 38 and 20.2, in the
example under consideration, are put down as equal to 360°,
and by the simple rule of three, the number of degrees
contained by the angle of the sector representing the area under
any crop is calculated. Thus, the angle of the sector for
area under rice in the U.P. is equal to 8/38 × 360 = 76° nearly.
Similarly, angles of other sectors standing for other crops
in the two provinces are found out. Having determined the
angles of different sectors, we show them at the centre of
circle. The process would be to draw any radius in a circle,
and taking it to be the starting point, to mark off the requisite
number of degrees on the circumference with the help of the
protractor, and then to draw radii from these marks to divide
the circle into requisite number of sectors.

Area under different food crops in the U.P. and the Punjab.
1 sq. in. = 28,00,000 acres.
Note: Figures inside the sectors stand for ...

Fig. 10
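The rule-of-three calculation of sector angles may be sketched in Python as follows. The text only says that each angle is taken "nearly"; the particular rounding rule used here, which gives the leftover degrees to the sectors with the largest fractional parts so that the total is exactly 360°, is an assumption of this sketch. With the approximated figures of Table 41 it gives the same angles as the table. The function name `sector_angles` is likewise hypothetical.

```python
import math


def sector_angles(parts):
    """Whole-degree angles proportional to `parts`, summing to exactly 360."""
    total = sum(parts.values())
    exact = {name: value / total * 360 for name, value in parts.items()}
    angles = {name: math.floor(x) for name, x in exact.items()}
    shortfall = 360 - sum(angles.values())
    # Give the leftover degrees to the sectors with the largest fractions.
    by_fraction = sorted(
        exact, key=lambda name: exact[name] - angles[name], reverse=True
    )
    for name in by_fraction[:shortfall]:
        angles[name] += 1
    return angles


# Approximated figures (million acres) from Table 41.
up = {"Rice": 8, "Wheat": 8, "Barley": 4, "Jowar": 2,
      "Bajra": 2, "Maize": 2, "Gram": 5, "Other": 7}
punjab = {"Rice": 1, "Wheat": 10, "Barley": 0.7, "Jowar": 0.8,
          "Bajra": 3, "Maize": 1.1, "Gram": 2.4, "Other": 1.2}

up_angles = sector_angles(up)
punjab_angles = sector_angles(punjab)
print(up_angles)
print(punjab_angles)
```

Plain rounding of each angle to the nearest degree would give 12° for barley in the Punjab and a total of 359°, so some such adjustment is needed if the sectors are to close the circle.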

It should be noted that we have proceeded with
approximated figures in calculating the angles of the sectors and not
with actual figures. But the difference between angles found
by using the correct figures and the approximated figures
would be insignificant and immaterial for all practical
purposes. If greater accuracy is desired, the direct method
where actual figures are used should be adopted.

Scale is given at the top in figure 10. Its calculation
does not involve any difficulty. The radii of the two circles
are in the proportion of √38 : √20.2, that is 6.1 : 4.5. We have
taken 2.03 inches as the radius for the circle for U.P., and
1.5 inches for that for the Punjab. The area of a circle is equal
to πr²; therefore, the area of the circle for the Punjab is
equal to π × (1.5)², or about 7.07, square inches. It represents
2,02,00,000 acres. Therefore, 1 square inch represents
28,00,000 acres. This is the scale to which circles have been
drawn. In figure 10, comparison of the areas under all food
crops in the U.P. and the Punjab can be made at a glance by
looking at the two circles, and comparison of the areas under
different crops can be made by a glance at the sectors and
their angles.
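The scale of figure 10 may be verified with the following Python sketch, given as an illustration under the figures stated in the text; the variable names are assumptions of the sketch. Worked without intermediate rounding, one square inch comes to rather more than 28,00,000 acres, and the U.P. radius to slightly more than 2.03 inches, so the figures in the text are to be read as round approximations.

```python
import math

punjab_acres = 20_200_000   # 2,02,00,000 acres, the Punjab total
punjab_radius = 1.5         # inches, as chosen in the text

# Area of the Punjab circle and the acreage one square inch stands for.
area_sq_in = math.pi * punjab_radius ** 2
acres_per_sq_in = punjab_acres / area_sq_in

# Radius of the U.P. circle on the same scale (total 38 against 20.2).
up_radius = punjab_radius * math.sqrt(38 / 20.2)

print(f"Area of the Punjab circle: {area_sq_in:.2f} sq. in.")
print(f"1 sq. in. represents about {acres_per_sq_in:,.0f} acres")
print(f"Radius of the U.P. circle: {up_radius:.2f} in.")
```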


If the components of a total are too many, too many
sectors will be required to represent them. The circle will
in such a case become complicated and lose its effectiveness.
In order to avoid it, figures below a certain quantity may be
grouped together so that the number of sectors may be reduced
to a manageable number.


In figure 10, it will be seen that in the circle representing
the area under food-crops in the U.P., the sectors are arranged
in order of magnitude. This arrangement has the advantage
that even if minute differences in areas for different crops are
ignored in the calculation of the angles of the sectors, they
will be clearly indicated by the order in which the sectors
occur in the circle. For instance, the areas under rice and
under wheat in the U.P. are, respectively, 77,64,000 and
81,09,000 acres. The difference is very little and has to be
ignored in the process of calculating the degrees of sectors;
but this difference is accounted for in the circle by placing
wheat before rice. Similarly, the differences between the
figures for jowar, bajra and maize are so slight that they give
equal degrees for their sectors, but by placing bajra before
jowar, and jowar before maize, the fact is made clear that,
although their degrees are the same, their magnitudes are in
the order in which they have been arranged in the diagram.
Again, it will be noted that the arrangement adopted in the
second circle, namely, that for the Punjab, is just the same as
in the case of the first circle, the reason being that only in
this manner would proper comparison between areas in the
two provinces under the same crop be easy and convenient.
It follows, therefore, that when only one circle is drawn to
represent components of one magnitude, the sectors in it should
be arranged in some order, ascending or descending, and,
when two or more than two circles are drawn, the sectors
should have the same order in all the circles.

It is, no doubt, theoretically possible to use circles and
sectors for showing the distribution of different incomes over
different items of expenditure; but in practice it is not done
for two reasons. Firstly, it is easier to draw sub-divided
rectangles than to draw sub-divided circles, for the calculation
of angles of sectors involves considerable labour. Secondly,
the heights of sub-divisions of a rectangle are measurable on
the scale showing percentages, and are, therefore, comparable
directly in percentages; while the sectors of a circle are
comparable directly in terms of angles and only indirectly in
terms of percentages. Therefore, the rectangular diagram is to
be preferred to the circular diagram for presenting family budgets.
A circle divided into sectors, however, is a good diagram for
showing the distribution of world population into various
continents or of world area into areas for different continents.

Three Dimensional Diagrams — Cubes.

When quantities which have the ratio of 1:1000 are to be
diagrammatically represented, even squares and circles, to
say nothing of bars, fail to serve the purpose, for if circles or
squares are drawn in such a case their radii or sides will have
to be in the ratio of 1:32, which dimensions are difficult to
show on the same scale. In cases like this surface diagrams,
like the square and the circle, are abandoned in favour of
volume diagrams or three dimensional diagrams, such as
cones, blocks, spheres, cubes etc. Of course, cubes are the
easiest to draw and are generally used. The sides of cubes
are proportional to the cube-roots of the data to be presented.
The sides of two cubes representing 1 and 1000 will be in the
proportion of ∛1 : ∛1000, or 1:10. Cubes should be used
only in those cases in which the data cannot be adequately
presented through bar or surface diagrams.
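The effect of passing from bars to surfaces and from surfaces to volumes can be seen from a small Python sketch. It is an illustration only, and the function name `linear_ratio` is an assumption of the sketch.

```python
def linear_ratio(quantity_ratio, dimensions):
    """Ratio of heights (1 dimension), sides or radii (2 dimensions) or
    cube edges (3 dimensions) needed to show two quantities that stand
    in the given ratio."""
    return quantity_ratio ** (1 / dimensions)


# Two quantities in the ratio of 1:1000, as in the text.
for dims, figure in [(1, "bars"), (2, "squares or circles"), (3, "cubes")]:
    print(f"{figure}: 1 : {linear_ratio(1000, dims):.1f}")
```

The lengths to be drawn thus fall from 1:1000 for bars to about 1:32 for squares or circles and to 1:10 for cubes.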

Table 42 gives the production of tea in certain provinces
of British India and Coorg.

Table 42. Production of Tea in some Provinces of India and
Coorg in 1939.

Provinces    Production      Production          Cube-root   Side of cube
             in '000 lbs.    (reduced figures)               in inches

Bengal       1,12,290        864                 9.6         1.9
Madras       28,872          299                 6.7         1.2
Punjab       2,807           22                  2.8         .6
Bihar        1,225           10                  2.2         .4
Coorg        130             1                   1.0         .2

The numbers giving the production are very great, and,
therefore, they are reduced to smaller ones by dividing each
of them by 130, the lowest number in the table. Cube-roots of
these reduced figures are then found out, but since the values
of the cube-roots are sufficiently large, each of them is divided
by 5 and the resulting figures, given in the last column of the
table, constitute the sides of the different cubes.


The construction of a cube requires explanation. Suppose
the side of a cube is 1.9 inches, as it is in the case of Bengal.
First, a square with its side equal to 1.9 inches is drawn. Then
another square of the same size is placed behind it in such
a way that a quarter of each of the two squares just covers
each other. This being done, the corresponding corners of
the two squares are joined and the construction of the cube
is complete. In figure 11, the method of constructing the
cube is given separately. Figures I and II show the
position of two squares ABCD and abcd having equal
lengths of their sides. abcd is then placed over ABCD in
figure III in such a manner that ad and dc are bisected by
AB and BC, and ad and dc are then rubbed off, and the
corresponding points A and a, B and b and C and c are joined.
The sides to be rubbed off are shown by dotted lines in
figure III. ADCcba is the required cube.

Production of Tea in 1939 in some Provinces of India and Coorg.

Fig. 11

In this manner all cubes have been drawn in figure 11 to
represent the data given in table 42. They are all drawn on
the same scale and are, therefore, comparable with one
another.


Pictograms — Maps and Pictures.

The device of pictures is being profusely used now for
comparing statistical data. One comes across pictorial
representation of facts very often in co-operative courts of
various exhibitions held in the country. They are also being
very much used for effective propaganda purposes. The
reasons for their coming into popularity are not far to seek.
They present dull masses of figures in an interesting and
attractive manner through objects of daily observation. The
image of the entire data is fixed in the mind of the observer
by a mere glance at the picture. He need not consult any
scale as is done in the case of reading diagrams. Relationship
between figures and their comparison can be studied through
pictograms much more easily than by studying a huge mass of
numerical data.

Fig. 12

Figure 12 shows the density and distribution of
population in India by religion. It shows two facts at one and the
same time, viz., it shows the density of population in different
provinces, and it exhibits the proportions of people belonging
to the prominent religions of the country. From a study of
the map it will be found that the total number of marks for
different religions for the Punjab is 18, which means that the
density of population in the province is 180 persons per square
mile. Out of these 18 marks, 6 stand for Hindus, 10 for
Muslims and 2 for Sikhs. Hence out of 180 persons per
square mile in the Punjab 60 are Hindus, 100 are Muslims and
20 are Sikhs, Christians and others being insignificant. It
will be seen in the map that the mark chosen for each religion
in itself corresponds with the sign of the religion concerned,
e.g., the Swastika for the Hindus and the Moon and the
Star for the Muslims. This enables us to compare at a glance
the proportions of different religions along with the density
of population. Instead of using religious signs we could have
put down typical human figures representing different
religions. Of course, the size of the map would have been larger,
but at the same time, the figure would have been more
interesting and attractive.

Figure 13 contains pictures of two money bags, one
showing the expenditure on industries and the other that on
police by the 11 provincial governments in India in 1940-41.
The expenditures are respectively Rs. 115 lakhs and Rs. 1120 lakhs.
The money bags are not drawn at random. They are enclosed
in squares whose areas are in the proportion of 115 : 1120, or
whose sides are in the proportion of √115 : √1120, or 10.72 :
33.47 or 1:3 approximately. A mere glance at the bags
enables a comparison of the amounts spent on the heads of
industries and police by provincial governments in India.

The money bags could have been placed in bars or
rectangles whose width may have been equal and heights in
proportion to the amounts. But then the shapes of the bags
would have looked unnatural, and instead of being attractive
the pictures would have become repulsive. Great caution is,
therefore, necessary in presenting statistical data in pictorial
form. Pictures cannot be used indiscriminately. Serious
thought must be devoted before using them, lest they might
look unnatural or ridiculous, or misrepresent the data.
Pictures are usually enclosed in squares, circles or rectangles
for the reason that it is not possible to calculate the surface
area of pictures. The pictures of the bags, for example, are
very irregular so that their areas cannot be easily determined.
They have, therefore, been put into squares.

Pictogram showing expenditure of the eleven provincial governments
in India in 1940-41 on Industries and Police.

Fig. 13

General Remarks. 

A number of ways in which diagrammatic methods may
be useful in presenting statistical facts have now been
considered. Forethought with regard to the suitability of a
particular form in a given case, and practice in drawing diagrams
are essential factors. Bars and circles, it will be found by
experience, are the easiest to draw and suitable for general
use. A particular case, however, may necessitate a square,
a rectangle or a cube. Attention, in drawing diagrams,
should always be fixed upon their neatness and on the precision
with which they represent facts. As has been already said,
continuous changes over a period of time are best shown by
graphs, and not by diagrams. Diagrams should therefore be
used in discrete or non-continuous series.

EXERCISES 

(1) What is meant by diagrammatic representation of facts?
What is its importance?

(2) How far is diagrammatic representation an advantage
over statistical tables?

(3) What are the different forms of diagrams? Explain in
detail the construction of any two of them.

(4) What precautions are necessary in drawing a good diagram?
What is the test of a good diagram?

(5) What mistakes are generally found in the diagrams?
How would you avoid them?

(6) Show with the help of a few examples that diagrams
can be wrongly used.

(7) Write short notes on:

Bar-diagram, pie-diagram, three-dimensional diagram,
pictogram.

(8) What kinds of statistical data are best represented by
diagrams? Illustrate your answer with examples.

(B. Com., Agra, 1937).

(9) Illustrate the following by suitable diagrams:

(a) In 1931 twelve seers of wheat could be had for one
rupee, while in 1943 a rupee would purchase only two-and-a-half
seers of wheat.

(b) 120 out of every 1,000 of the population of India were
literate in 1941, as against 95 ten years ago.

(c)                                   I                  II

Price of a commodity                  Rs. 10 per unit    Rs. 12 per unit
Quantity sold                         20                 24
Value of raw materials used           Rs. 100            Rs. 120
Other expenses of production          Rs. 60             Rs. 96
Profit                                Rs. 40             Rs. 72



(10) Represent the following statistics of successful University
Graduates in Arts and Science in the form of bar diagrams:

Provinces            1916-17     1930-31

Madras               1,200       2,100
Bombay               700         1,100
Bengal               2,200       3,100
U. P.                700         2,100
Punjab               600         1,300
Bihar and Orissa     200         400

Total                5,600       10,100


(11) Represent the following data by suitable diagrams:

Electricity sold in British India in 1940-41 and 1941-42

                              1940-41        1941-42
                              units (000)    units (000)

Domestic consumption          156,916        138,301
For offices and other uses    10a,50.S       109,805
Industrial Power              i,37;i,6;n     1,603,487
Street lighting               46,419         32,563
Tramways                      37,471         46,315
Electric Railways             161,534        315,223
Miscellaneous                 60.S80         110,934

Total units sold              1,940,624      2,356,628


(12) The following figures relate to the postal traffic in India
in 1940-41 and 1941-42. Represent them by suitable diagrams.

Articles                      1940-41         1941-42
                              Number (000)    Number (000)

Letters                       529,096         541,528
Postcards                     365,458         413,096
Regd. Newspapers              78,535          80,578
Books and pattern packets     110,703         99,613
Unregistered Parcels          3,324           3,426

(13) The following is the distribution of scholars in
institutions for females in India, 1939-40.

Assam            54,891
Bengal           519,735
Bihar            82,891
Bombay           248,880
C. P.            56,549
Madras           452,059
N. W. F. P.      18,977
Orissa           19,878
Punjab           202,180
Sind             40,368
U. P.            159,099

Total            1,894,590

Represent the above data by suitable diagram.

(14) The following are the percentages of expenditure on
education in 1939-40 for three provinces:

Sources           Assam     Bengal    Bihar

Govt. Funds       54.64     34.20     29.47
Local Funds       13.09     7.30      28.39
Fees              21.66     42.70     28.04
Other sources     10.61     15.80     14.10

Represent the above figures by suitable diagram.




(15) Value of the imports of glass and glassware into India
from different countries during the year 1931-32.

Japan               42 lakhs of rupees
Czechoslovakia      23   "    "    "
Germany             20   "    "    "
U. K.               13   "    "    "
Belgium             13   "    "    "
Other countries     11   "    "    "

Represent the above figures by suitable diagrams.

(B. Com., Alld., 1933).


(16) Draw a simple diagram to represent the following
statistics relating to the area under different crops in British India
in 1933-34, and write a brief note on the given data:

Crop                      Million Acres

Rice                      80.3
Wheat                     27.6
Jowar                     21.4
Other food crops          88.2
Oil-seeds                 17.8
Cottons                   14.5
Other fibres              3.1
Fodder crops              10.2
Other non-food crops      3.9

(B. Com., Cal., 1937).

(17) The following table gives the birth rates and death rates
of a few countries of the world during the year 1931:

Country             Birth rate    Death rate

Egypt               44            27
Canada              24            11
U. S. A.            19            12
India               33            24
Japan               32            19
Germany             16            11
France              18            16
Irish Free State    20            14
United Kingdom      16            12
Soviet Russia       40            18
Australia           20            9
New Zealand         18            8
Palestine           53            23
Sweden              15            12
Norway              17            11

Represent the above figures by a suitable diagram.

(B. Com., Luck., 1938).

(18) The following table gives the details of monthly
expenditure of three families:

Items of Expenditure       Family A     Family B     Family C
                           Rs.  As.     Rs.  As.     Rs.

Food                       12   0       30   0       90
Clothing                   2    0       7    0       35
House-rent                 2    0       8    0       40
Education                  1    8       3    0       12
Litigation                 1    0       5    0       40
Conventional necessity     0    8       3    0       60
Miscellaneous              1    0       4    0       23

Represent the above figures by a suitable diagram. Which
family is spending the money most wisely? Give reasons.

(M.A. Econ., Alld., 1937).


(19) The following table gives the details of the cost of
construction of a house in Allahabad:

                  Rs.

Land              4,500
Labour            2,500
Bricks            2,000
Iron              1,800
Timber            1,500
Cement            800
Lime              800
Stone             600
Sand              200
Other things      1,300

Represent the above figures by a suitable diagram.

(B. Com., Alld., 1941).





(20) Represent the following by a suitable diagram:

                              1933      1938      1943
                              Rs.       Rs.       Rs.

Proceeds per chair            12        12        20

Costs per chair:
  Wages                       7         7         10
  Other costs                 5.5       4         6
  Polishing                   1.0       1.5       2.5

                              13.5      12.5      18.5

Profit or Loss per chair      -1.5      -.5       +1.5


(21) Illustrate the following by suitable diagram:

Production of Cotton in Egypt.

1911-12      3111000 tons
1911-15      6450000 tons
1918-19      10873000 tons


(22) Following are the figures of the population of the various
countries of the world and of total world population in 1931:

Country         Population
                (000's omitted)

China           411,770
India           352,370
U. S. S. R.     161,000
U. S. A.        124,070
Germany         64,776
Japan           64,700
U. K.           46,077
France          41,860
Italy           41,100
Others          706,077

World           2,012,800

Represent the above figures by a suitable diagram.

(23) Illustrate the following data diagrammatically:

Area in (000) square Kilometers of the continents of the world.

Continent       Area

Asia            41,900
S. America      40,687
Africa          29,946
N. America      19,658
Europe          11,426
Oceania         8,550
Others          2,764

World           154,926


(24) Represent the following figures diagrammatically:

City            Females per
                1,000 males

Calcutta        464
Madras          908
Lahore          596
Lucknow         516
Benares         781
Peshawar        708
Tinnevelly      1,068


(25) Show diagrammatically the balance of trade in cotton
piece-goods from the following data.

             Exports         Imports
             million yds.    million yds.

1939-40      221.3           579.1
1940-41      390.1           447.0
1941-42      779.4           181.5

Also show in separate diagrams

(a) the proportion of exports and imports to total trade in
1941-42.

(b) the proportion of exports to imports in 1939-40.



CHAPTER XV 


GRAPHICAL PRESENTATION

Diagrams and maps discussed in the preceding chapter 
are particularly suited to the comparison of variables spread 
over different places or different heads at the same time. For 
illustrating series spread over a period of time and also for 
illustrating frequency distributions, graphical methods are 
made use of. In this chapter, therefore, we shall discuss 

(1) Graphs of Time Series showing continuous changes, 

(2) Graphs of Frequency Distributions. 

Diagrammatic and Graphic Presentations Contrasted.

In the diagrammatic method, bars, rectangles, circles etc.
stand for quantities individually or in groups. In the
graphical method quantities are not represented by one or
more dimensional figures, but are located on a surface with
respect to two or more dimensions, for which purpose a
system of rectangular co-ordinates, such as that shown in
figure 14, is used.

Two straight lines x x′ and y y′ intersecting each other at
right angles are drawn. The horizontal line is the x-axis or
abscissa and the vertical line the y-axis or ordinate. The
junction of the axes at o is known as the point of origin or
zero. Distances measured towards the right or upwards from
the origin are reckoned as positive, and those measured
towards the left or downwards as negative. All points in the
plane, i.e., the four quadrants into which the graph is divided,
are located by reference to two co-ordinates drawn parallel
to the axes.

Fig. 14. A System of Rectangular Co-ordinates.

The scales of measurement on the axes may be chosen at
convenience. There is no necessary connection between
the x and the y scales. In figure 14 the scales of x and y axes
are equal and the points P, Q, R, S have been located as
follows:


Point     x      y

P         +6     +2
Q         -2     +3
R         -4     -5
S         +4     -4


The co-ordinates of the points are indicated by dotted
lines. But the dotted lines need not be made visible; only
the point should be shown. In practice, graph papers, which
have a net-work of fine lines, are used for graphical
presentation. By using them the necessity of drawing dotted lines is
dispensed with.

It is evident from the above that to locate quantities in a
plane bounded by the x and the y axes in the manner
illustrated in figure 14 is not the same thing as to represent them by
bars, squares, rectangles etc. The latter are themselves drawn
proportional to the amounts which they represent.
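The sign convention for co-ordinates may be illustrated by a short Python sketch that finds the quadrant in which each of the points P, Q, R and S lies. The numbering of the quadrants from 1 to 4, counted anticlockwise from the upper right, is a common convention assumed here and is not stated in the text; the function name `quadrant` is likewise an assumption of the sketch.

```python
def quadrant(x, y):
    """Quadrant (1 to 4, counted anticlockwise from the upper right) of a
    point lying off both axes; None for a point on either axis."""
    if x == 0 or y == 0:
        return None
    if x > 0:
        return 1 if y > 0 else 4
    return 2 if y > 0 else 3


# The four points located in figure 14.
points = {"P": (6, 2), "Q": (-2, 3), "R": (-4, -5), "S": (4, -4)}
quadrants = {name: quadrant(x, y) for name, (x, y) in points.items()}
print(quadrants)
```

Each of the four points thus falls in a different quadrant, which is presumably why they were chosen for the figure.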






All truly continuous series are properly presented by
graphical as distinct from diagrammatic methods. Graphic
methods are more powerful than diagrammatic ones in so far
as they not only present the facts effectively but also bring to
light new relations that may not at first sight be visible from
a study of the quantities themselves. For instance, we may
determine through graphs whether phenomena are connected
or independent.

GRAPHS OF CONTINUOUS TIME SERIES 
Rules for drawing Graphs. 

A continuous series may be measured (1) in time, or (2)
in space, or (3) be represented by frequencies of a variable
at the same time or place. We shall first deal with the
graphical presentation of the first type of series, viz., time
series.

Time series is also called historical series, since it stands
for the numerical record of the changes in a variable during
a number of successive intervals in a period of time. The
first problem to be considered in drawing a graph of such series
is the choice and adjustment of scales.

Choice and adjustment of scales. A system of
rectangular co-ordinates as illustrated in figure 14 is used to
illustrate time series. Time units are placed on the abscissa or
x-axis, and the sizes of variable measured on the ordinate or
y-axis. Since time has no zero, the horizontal scale need not
begin with zero; the first time interval may be indicated near
or away from the point of origin or o. The x-axis should be
divided into equal parts, each of which should represent
periods of equal length. The vertical scale should begin with
zero when sizes of variable are shown on it, since they are
always reckoned from zero. Equal space units on the y-axis
should represent equal amounts when natural scale is used, and
equal rates of change when ratio scale is used. Just now, we
shall be concerned with the natural scale.

What should be the proportion between the abscissa 
and the ordinate scales? Bowley states the problem and the 
way in which it should be solved as follows: — 

It is difficult to lay down rules for the proper choice of the scales 
by which the figure should be plotted out. It is only the ratio between the 
horizontal and vertical scales that need be considered. The figure must 
be sufficiently small for the whole of it to be visible at once; if the figure 
is complicated, relating to a long series of years and varying numbers, 
minute accuracy must be sacrificed to this consideration. Supposing the 
horizontal scale decided, the vertical scale must be chosen so that the part 
of the line which shows the greatest rate of increase is well inclined to the 
vertical, which can be managed by making the scale sufficiently small; 
and, on the other hand, all important fluctuations must be clearly visible,
for which the scale may need to be increased. Any scale which satisfies 
both these conditions will fulfil its purpose. 

Thus, the scales chosen should be such as would allow the
full data to be presented on the graph, would properly show
the extreme fluctuations and would clearly bring out the
changes over the entire period from date to date. The two
scales selected will, no doubt, depend upon the size of the
paper; still it should be carefully noted that if the units occupy
too much space, small changes in the size of the variable will
appear to be important fluctuations while, if the units occupy
too little space, even large fluctuations would look unimportant.
The respective scales will, it is obvious, be different for
different data. No one standard can suit all cases; yet it is
desirable, as a general rule, to have the x-axis approximately
[…] times as long as the y-axis.

* Bowley, A. L., Elements of Statistics, 1920 ed., p. 132.

Having decided the proportion of the two scales, the ordinate
scale should be divided into units such that they would
be easily comprehended in terms of the rulings of the paper
used. For instance, if the paper is ruled in fifths or tenths,
ten small squares should not be made equal to such an amount
as 4357. A given space should equal some multiple of ten or
five, as 3000, 250, 25 etc. When the scale has been divided into
units, the ordinate should be labelled in terms of the scale unit,
i.e., at each equal space the value of it in terms of the scale
unit should be put down. One should not try to fill up the
scale with too many details, and the putting down of each
successive frequency, or every frequency that is plotted, should
be avoided.
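The advice on round scale units can be sketched in a few lines of Python. This is only an illustration and not a procedure laid down in the text: the function name `round_scale_unit` and the list of round multipliers (1, 2.5, 5 and 10 times a power of ten) are assumptions made for the sketch.

```python
import math

def round_scale_unit(largest_value, divisions):
    """Smallest 'round' amount per division that still holds largest_value.

    The multipliers 1, 2.5, 5 and 10 are an assumed convention for this sketch.
    """
    raw = largest_value / divisions            # awkward exact amount per division
    power = 10 ** math.floor(math.log10(raw))  # largest power of ten not above raw
    for multiplier in (1, 2.5, 5, 10):
        unit = multiplier * power
        if unit >= raw:
            return unit
    return 10 * power  # safety net; 10 * power is always at least raw

# Ten divisions have to carry a largest value of 4357:
# 500 per division is easier to read than 435.7.
print(round_scale_unit(4357, 10))  # 500
```

With a largest value of 35 and ten divisions the same sketch gives 5 per division.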

The natural scale is thus ready. We shall deal later with 
ratio scales and false base line. Presently, the problem of 
plotting the data is taken up. 

Plotting the data. To plot the given data the method
shown in figure 14 will be followed. To plot the size of an
item against a particular date, only a point placed on the
ordinate concerned is enough. Thus when the whole data are
plotted, there will be as many points on the plane as the number
of dates. Since time is a continuous factor, these several
points should be connected from date to date by continued
smoothed lines, each point being simply a conventional stopping
place. This continuous smoothed line is called the curve;
it shows the probable changes at all possible intervals of the
entire period to which the data relate. This curve is also given
the name of Historigram. This name must be distinguished
from histogram, into the construction of which the factor of
time does not enter. But drawing a smooth curve requires
practice and skill which everyone does not usually possess.
An alternative, which is commonly adopted, therefore is that
of connecting the points plotted by straight lines.* When the
curve is plotted, it should be given a short, but adequate title,
indicating what it represents. And, if several curves are
plotted, each should be differentiated either by using different
inks or by adopting some such devices as drawing a straight
continuous line, dotted line, or dot-and-bar line, if using the
same ink. An explanation of what the different inks used or
lines drawn indicate in the historigram should be separately
given in a corner of the paper as given in figure 16, or the name
of the factor represented may be written on the curve itself
if it does not spoil the figure, as done in figure 30.

* In mathematics the term curve includes a straight line.

Different Types of Graphs on the Natural Scale.

The purpose of graphical presentation is comparison. Comparison
is necessary to have a clear idea of the relationship
of things in time and space. Our aim may be to study:

1. Changes of a single variable. 

2. Changes of two or more variables. 

These changes can be studied on the Natural Scale or
“Difference” Charts and on the Ratio Scale or “Ratio” Charts.

First we take up the natural scale graphs. On it, changes 
of a single variable are studied through (1) Absolute histori- 
gram, and (2) Index historigram. And changes in two or more 
variables are also compared through (1) Absolute historigrams, 
and (2) Index historigrams. In addition to these, we shall 
take up a discussion of a few other ways of drawing diagrams 
for comparison of different variables and of the false base line. 

Absolute Historigram of one Variable. When original
quantities, and not index numbers, are presented graphically,
the resulting figure is called absolute historigram as distinguished
from an index historigram which relates to an index
number series. In figure 15, amounts of treasury bills
tendered in India from week ending 10th February, 1942 to
week ending 28th April, 1942, given in table 43, are shown.
One small division represents on the x-axis one week, and on the
y-axis Rs. 5,000,000. These scales give an outline which is
neither too flat nor too angular and shows the fluctuations
clearly. Since all the quantities in table 43 are positive, only
the north-east quadrant of the system of rectangular
co-ordinates has been utilized. Only one variable has been
treated in the figure and changes in its size are comparable
through the graph from week to week. Thus, the amount
of treasury bills tendered increased from Rs. 30,000,000 on
17th March, 1942 to Rs. 33,000,000 on 24th March, 1942, that
is, it increased by Rs. 3,000,000.


Table 43. Treasury Bill tenders in India, 1942.

Week ending        Amount tendered
    (x)             Rs. (000,000)
                         (y)

Feb.    10th             22
Feb.    17th             28
Feb.    24th             27
March    4th             23
March   10th             24
March   17th             30
March   24th             33
March   31st             31
Apr.     7th             30
Apr.    14th             24
Apr.    21st             27
Apr.    28th             35









Index Historigram of one Variable. When a time series
consisting of index numbers is given, it may be graphically
presented in just the same manner as the series consisting of
absolute or actual values. In figure 28, an index historigram
relating to the retail price of wheat in India is shown along
with other curves. An important difference between index
historigram and absolute historigram is that the latter shows
the actual values and therefore studies the movement of absolute
sizes of the variable from date to date, while the former
shows the index numbers and therefore studies the change on
a particular date as compared with the base year. This latter
change is studied not in the absolute size of the variable, not
in the unit in which the variable may have been measured, but
in percentages of the base year. Thus, from figure 28 we can
study that retail price of wheat in 1877 was two per cent. more
than that in 1873, the base year; we cannot study the absolute
amount of money by which the increment was effected.

Absolute Historigrams of two or more variables (Homogeneous
units). If two or more variables measured in the
same unit are given, all of them can be exhibited on the
same graph, with common vertical and horizontal scales.
Figure 16 shows values of monthly exports, imports and
balance of trade of India for 1932-33. The three curves are
drawn in the same manner as shown in figure 14. It should
be noted that since some of the values in the balance of trade
series, shown in column 4, table 44, are negative, the eastern
or the right half of the system of rectangular co-ordinates
has been utilized.




Table 44. Imports, exports and balance of trade of India
during 1932-33. (In crores of Rs.)

Month          Imports     Exports     Balance of trade
 (x)             (y)         (y)             (y)

April             13          11             -2
May               12          10             -2
June              12          10             -2
July              11           9             -2
August            11          10             -1
September         11          13             +2
October           10          12             +2
November          11          12             +1
December          10          13             +3
January           11          12             +1
February           9          12             +3
March             11          13             +2


In figure 16, comparison can be made
only of the absolute amounts of exports, imports and balance
of trade, since the historigrams show absolute values. It will
not be possible to compare proportional changes between
them during the same period.

Absolute Historigrams of two or more variables (Heterogeneous
units). If two or more variables are measured in
different units, all of them can be exhibited on the same graph,
but with different ordinate scales, the horizontal scale being
common. The ordinate scales will have to be different for the
simple reason that the variables are not measured in the
same unit. With this difference only, the curves shall be
prepared in just the same manner as they have been in
figure 16. A study of the changes in the absolute amounts of
the same variable from one date to another will be possible, but
comparison of changes in one variable with those in the other
during the same period will not be possible. For, of two variables,
if one is measured in tons while the other in yards, tons
will be comparable with tons and yards with yards, but not
tons with yards.


[Fig. 16. Values of Monthly Exports, Imports and
Balance of Trade of India in 1932-33.]



Index Historigrams to compare changes of two or more
variables. In the case of absolute historigrams of two or more
variables expressed in the same or different units, it has been
noticed that only absolute sizes are comparable, and it is not
possible to study from them whether change in the value of one
variable from one date to another is similar to or different from
the change in another variable during the same period. For
instance in figure 16, it is possible to note that between April
and May exports fell from Rs. 11 crores to Rs. 10 crores and
imports from Rs. 13 crores to Rs. 12 crores. That is, in both the
cases the fall is by an equal amount of a crore of rupees. But,
is the proportional decline the same in both the cases? This
information cannot be had from the figure referred to. To
compare proportional change, an easy way will be to reduce
the two or more given variables to index numbers on the same
base, and then plot index historigrams. All the curves being
reduced to like bases, it is easy to compare the proportional
changes in relation to the base in the different variables during
the same period. If figures relating to exports and imports
during May, to refer to table 44, are converted into index
numbers, it is found that with April figures equal to 100, the
index for exports for May is 91, and that for imports for the
same month is 92. Thus, it is possible to see that the proportional
decrease from April to May in exports is greater than
that in imports. The fact that the absolute decrease in both
exports and imports is the same in no way obscures the record
of the comparative proportional change. In index historigrams
it is advisable to draw a line parallel to the x-axis from
the point 100 on the ordinate, so that proportional change on
any date from the base, whose value is put down at 100, may
be seen at a glance.
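The conversion to index numbers on a common base can be checked with a short Python sketch. The helper name `index_numbers` is an assumption made here for illustration; the figures are the April and May entries of table 44.

```python
def index_numbers(series, base_position=0):
    """Express each value as a rounded percentage of the value at base_position."""
    base = series[base_position]
    return [round(100 * value / base) for value in series]

exports = [11, 10]   # April, May (Rs. crores, table 44)
imports = [13, 12]   # April, May (Rs. crores, table 44)

print(index_numbers(exports))  # [100, 91]
print(index_numbers(imports))  # [100, 92]
```

Both series fall by one crore, yet the index for exports drops to 91 and that for imports only to 92, which is the proportional difference the paragraph describes.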

But, as was noticed in dealing with index historigram of
a single variable, index historigrams are no improvement on
the absolute historigrams if it is desired to compare different
periods in the same series with regard to the relative changes
therein. All index numbers compare the change with the
base year and not between themselves. Therefore, index
curves of exports and imports, with April as the base in the
example under consideration, shall exhibit the comparison of
proportional change with April; study of proportionate
changes from August to September, or from December to
January, will not be possible. It may now be observed that
index historigrams aid in comparison of the fluctuations of
different variables at a certain particular date in relation to
the base, but the function of clearly exhibiting relative
changes over periods of time is reserved for logarithmic
historigrams, which we shall study later.

Method of Scale conversion for comparing changes in two
or more variables. It has been noted above that when variables
are expressed in different units, or even in the same
unit, they may be converted into index numbers to compare
proportional changes relative to the base. Several methods
of comparing the differences between two or more variables
during a particular period are, however, available. These
relate to converting one scale into terms of the other scales. Of
these methods we take up one below, which is particularly
suitable in a case where variables are expressed in different units.

The method is very simple. When two or more variables
are given, separate scales may be chosen for different variables,
but each should be made proportional to the respective
averages of each. Table 45 gives the volume in cwts. and
value in Rs. of lac exported from India month by month during
1941-42. First, averages of these two series are computed
which are, respectively, 64,000 cwts. and Rs. 41,00,000. These
average values are plotted in figure 17 on two separate
vertical scales in such a way that the average values of the
two are represented by the same position on the vertical
scales. After the scales have been thus adjusted, the values
of the variables are plotted. Each of the two curves should
be read in terms of its own scale.

Table 45. Volume and Value of exports of lac from India in 1941-42.

Month            Volume          Value
 (x)              (y)             (y)
               Cwts. (000)    Rs. (00,000)

April              53              22
May                80              34
June               89              40
July               96              50
August             56              33
September          69              43
October            32              23
November           60              48
December           22              19
January           102              83
February           60              51
March              49              46
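The two averages, and the idea of drawing each series on a scale proportional to its own average, can be sketched as follows. The helper `height_on_chart` is a hypothetical name, and the sketch assumes both scales are measured from zero, so it ignores the false base line used in figure 17.

```python
# Monthly figures from table 45, April to March.
volume = [53, 80, 89, 96, 56, 69, 32, 60, 22, 102, 60, 49]   # 000 cwts.
value = [22, 34, 40, 50, 33, 43, 23, 48, 19, 83, 51, 46]     # Rs. (00,000)

mean_volume = sum(volume) / len(volume)
mean_value = sum(value) / len(value)

def height_on_chart(amount, series_mean):
    """Height of a point when the mean of its series is drawn at height 1.0."""
    return amount / series_mean

print(mean_volume, mean_value)               # 64.0 41.0
print(height_on_chart(64, mean_volume))      # 1.0
print(height_on_chart(41, mean_value))       # 1.0
```

Because each amount is divided by the mean of its own series, the two averages fall at the same height, which is the adjustment the method calls for.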

One difference between this method and the method of
index historigrams discussed above is that in this method
actual values are plotted, while in the other method values
relative to the base year (index numbers) are plotted. In
figure 17 it is easy to compare the two series since their
averages lie on the same line. It will be seen that in
figure 17 false base line has been taken to adjust the scales.

[Fig. 17. Volume and Value of Exports of Lac from India in 1941-42.
Vertical scales: Volume, 000 Cwts., and Value, Rs. (00,000).
Horizontal axis: months.]






False Base line. In figures 15 and 16, the scales on the
ordinate beginning from zero are continuous, while in figure
17, the scale is broken or discontinuous. In figure 17 we
have used the false base line.

In those cases in which the fluctuations are small relatively
to the size of the variable and the insignificance of those
fluctuations is to be visualized, and in those cases where
adjustment of scales is not otherwise possible (as in fig. 17),
instead of showing the entire vertical scale from zero to the
highest value involved, only as much of it is shown as is
just sufficient for the purpose. That portion of the scale
which lies between zero and the smallest value of the variable
is omitted. This has been done in figure 17 and is further
illustrated in figure 18 which shows the index numbers of
prices of Government Securities during 1937-42. The range of
the series is (123 - 104) = 19, and the fluctuations are small.
To amplify these fluctuations the vertical scale shown is only
between the limits 103 and 124. On this wide vertical scale
the fluctuating movement in the price of Government Securities
is brought into prominence. The fact that a false base line
has been taken and the vertical scale is not shown in its
entirety should always be made clear on the graph by the
double saw-tooth line, as shown in figures 17 and 18, or by
some other device. If a continuous vertical scale beginning
from zero and going up to the maximum value of the index
numbers were used to plot the indices of the prices of Government
Securities, a slightly waving line would have resulted
and the fluctuations would have been quite inconspicuous.
Similarly, in figure 17 adjustment of the scales would not have
been possible without resorting to the false base line.

[Fig. 18. Index numbers of prices of Government Securities
during 1937-42. Horizontal axis: months, 1937 to 1942.]

False base line should be used in rare and exceptional
cases. It is always safe to show the zero line, for a correct
study of the proportional changes is possible only when the
zero of the ordinate in the graph is shown, and the height
of every point on the curve is shown in full. For example,
the proportion between two numbers, 100 and 400, if expressed
in full above the zero line is 1 : 4. But if the vertical scale
from 0 to 50 is omitted, each of the two figures will be reduced
by 50. If the horizontal line passes through 50, the
respective magnitudes of these two numbers above the
horizontal line will be 50 and 350, whose ratio will not be 1 : 4
but 1 : 7. Thus, wrong impressions might be created in regard
to proportions if a portion of the vertical scale is omitted,
i.e., if a false base line is taken. This would, however, not
happen when the significance of the false base line is kept in
view.
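The distortion of proportions can be verified with exact fractions. The function name `apparent_ratio` is an assumption made for this sketch; the numbers 100, 400 and 50 are those of the example above.

```python
from fractions import Fraction

def apparent_ratio(smaller, larger, base_line=0):
    """Ratio of the heights drawn above base_line for two plotted values."""
    return Fraction(smaller - base_line, larger - base_line)

print(apparent_ratio(100, 400))                 # 1/4
print(apparent_ratio(100, 400, base_line=50))   # 1/7
```

With the zero line shown the heights stand as 1 : 4; with the base line raised to 50 the same two numbers appear to stand as 1 : 7.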

Not infrequently people resort to false base line either
to economise space or for want of space. Thereby they
unconsciously run into the error of making fluctuations look
larger than they really are. Sometimes this is done deliberately.
At other times, figures are plotted against too wide a
vertical scale to make the increase or decrease appear larger
than is really the case. Therefore, while studying curves and
commenting on them these facts must not be ignored.


Graphs on Ratio Scale. 

So far, we have been dealing with the natural scale
graphs in which the y's are scaled in proportion to their actual
values. We have seen that this method shows absolute movements
in statistical series, but fails to exhibit relative movements
in their proper perspective. The importance of the
study of relative changes in economic investigations is growing
in recent times. To study relative changes the Ratio scale
or Logarithmic scale is employed as an alternative to the
Natural scale.

The difference between these scales is that with the natural
scale equal distances on the ordinate represent equal absolute
movements, while with the ratio scale they represent equal
proportionate movements.

Ratio Scale. Ratio scale is based on geometric progression,
while the natural scale is based on arithmetic progression.
This fact is borne out in figure 19, which compares the natural
scale with the ratio scale.

[Fig. 19. Natural Scale contrasted with Ratio Scale: the natural
or difference scale shown beside the percentage or ratio scale.]


The importance and usefulness of the ratio scale can be
easily seen. Let us suppose that the population of a
certain town increased as follows:

Year      Population      Increase

1920       200,000
1930       300,000        50 per cent.
1938       400,000        33.3 per cent.

The absolute increase in the two periods under study is
identical, or 100,000 in each case, but the proportional increase
differs, for in the first case it is 50% of 1920, and in the latter,
33⅓% of 1930. If, then, these population figures are plotted
on the natural scale, the increment in population (100,000) in
each case will be shown by equal distances, thereby leading
to the conclusion that the increments between 1920 and 1930
and between 1930 and 1938 were equal. But the ratio graph
will indicate that the increase took place at the rate of 50%
and 33⅓% respectively.
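The contrast between the two scales in this example can be put in numbers. The sketch below is an illustration under the assumption that the ratio scale is a base-10 logarithmic scale; the helper name `steps` is hypothetical.

```python
import math

def steps(series):
    """(absolute change, percentage change, change in log10) between successive values."""
    result = []
    for earlier, later in zip(series, series[1:]):
        result.append((later - earlier,
                       round(100 * (later - earlier) / earlier, 1),
                       round(math.log10(later / earlier), 3)))
    return result

population = [200_000, 300_000, 400_000]   # 1920, 1930, 1938
print(steps(population))
# [(100000, 50.0, 0.176), (100000, 33.3, 0.125)]
```

The first entries (distances on the natural scale) are equal, while the last entries (distances on the ratio scale) differ in the same direction as the percentages.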

Logarithmic Curves. Rates of change may be graphically
presented in either of two ways:

(1) by plotting the logarithms of the amounts on a
natural scale;

(2) by plotting the amounts themselves on a logarithmic
scale.

The latter method is simple and preferable because the
exact meaning of logarithms of numbers is not generally
grasped, and also because logarithmic papers are available on
which merely the amounts may be plotted.

Logarithmic curve is also termed Semi-logarithmic
curve for the simple reason that one variable (usually y) is
plotted on a logarithmic scale, while the other variable
remains upon the natural scale.

Table 46 shows how a sum of Rs. 10 borrowed by A and
another sum of Rs. 100 borrowed by B in 1934 increase at the
rate of 20% per annum, compound interest being charged on
both the sums. Figure 20 shows a graph of the two series on
the natural scale. From the figure it appears that the rate of
increase in series B is more rapid than in series A. This,
however, is not the case as is shown in figure 21, in which the
two series are drawn on the ratio scale, logarithms of the
values of the two variables being plotted. The equal percentage
increase is properly and clearly brought out in the ratio
chart, figure 21. If logarithmic paper were used, actual
amounts would have been plotted on it. The resulting curves,
again, would have run parallel to each other indicating a
uniform rate of increase.

[Fig. 20. Sums of Rs. 10 and Rs. 100 rising at compound
interest on the Natural Scale. Vertical axis: amount.]

[Fig. 21. Sums of Rs. 10 and Rs. 100 rising at compound
interest on the Ratio Scale.]



Table 46. Increment of sums of Rs. 10 and Rs. 100 at 20%
p. a. compound interest.

Year          A            B
             Rs.          Rs.

1934         10          100
1935         12          120
1936         14.4        144
1937         17.3        172.8
1938         20.7        207.4
1939         24.9        248.9
1940         29.9        298.5
1941         35.8        358.2
1942         42.2        421.6
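Why the two curves part company on the natural scale but run parallel on the ratio scale can be seen from a short computation. The sketch assumes exact compounding at 20% from 1934 to 1942 and a base-10 logarithmic scale; the function name `compound` is an assumption. Computed exactly, the amounts for the last few years come out somewhat above the printed entries of table 46 (about 43.0 rather than 42.2 for A in 1942), which does not affect the comparison.

```python
import math

def compound(principal, rate, years):
    """Amounts at the end of 0, 1, ..., years at compound interest."""
    return [principal * (1 + rate) ** t for t in range(years + 1)]

a = compound(10, 0.20, 8)     # series A, 1934 to 1942
b = compound(100, 0.20, 8)    # series B, 1934 to 1942

# Vertical gap between the two curves, year by year, on each kind of scale.
natural_gaps = [y - x for x, y in zip(a, b)]
log_gaps = [math.log10(y) - math.log10(x) for x, y in zip(a, b)]

print(round(natural_gaps[0]), round(natural_gaps[-1]))   # 90 387
print([round(gap, 6) for gap in log_gaps])               # 1.0 in every year
```

The gap in rupees widens from 90 to about 387, so B looks faster on the natural scale; the gap in logarithms stays at 1.0, so the curves are parallel on the ratio scale.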









Instructions for reading of Logarithmic Curves. The
following general rules will help the reading of logarithmic
historigrams:

1. If a curve rises upwards, the rate of growth is increasing.

2. If a curve is falling downwards, the rate is decreasing.

3. If a curve is ascending but is nearly straight, the
magnitude it represents is growing at a nearly uniform rate.

4. If a curve is descending but is nearly straight, the
magnitude it represents is decreasing at a nearly uniform rate.

5. If a curve is a straight line the rate of change is uniform
or constant.

6. If a curve is steeper in one portion than in another
portion, the rate of change in the former is more rapid than
in the latter.

7. If two curves on the same ratio chart are found running
parallel they represent equal percentage rates of change
(see figure 21).

8. If one curve is steeper than another on the same ratio
chart, the rate of change in the former is more rapid than in
the latter.

When comparison is made between percentage of increase
and percentage of decrease directly, it is essential to remember
that a loss of 50% is not made good by a gain of 50%, but by
that of 100%. Great care should, therefore, be exercised in
comparing increases and declines on ratio charts. However,
if increases are compared with increases and decreases with
decreases no such care is required.
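The caution about comparing a percentage loss with a percentage gain can be illustrated numerically. The function name `recovery_gain` is an assumption made for this sketch, and the ratio scale is again taken to be a base-10 logarithmic scale.

```python
import math

def recovery_gain(loss_percent):
    """Percentage gain needed to restore a value after loss_percent of it is lost."""
    remaining = 1 - loss_percent / 100
    return 100 * (1 / remaining - 1)

print(recovery_gain(50))   # 100.0

# On a ratio scale the fall to one half and the recovery by doubling
# cover equal vertical distances, although the percentages differ.
print(math.isclose(-math.log10(0.5), math.log10(2.0)))   # True
```

Equal distances up and down on a ratio chart therefore correspond to unequal percentages (50% down, 100% up), which is why increases should be compared with increases and decreases with decreases.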

Advantages and disadvantages of Ratio Scale. A ratio
scale has no zero since it compares relative rates of change.
A natural scale has a zero because it compares absolute values.
Consequently, zero line is necessary in the natural scale and
quite unnecessary in the logarithmic scale. Since there is no
zero in a ratio scale, there is no danger of fallacious conclusions
being drawn from the graph. We have seen how fallacious
conclusions might be drawn from graphs drawn on
natural scale whose zero line is omitted.

Ratio scale makes extrapolation (finding out a future probable
figure) possible if the data are organic in character. If
population figures of a certain country are given and are plotted
on the ratio scale, the curve may be extended in continuation
with its trend beyond the last date to a next date to obtain
thereby a fairly accurate estimate of what the next figure is
likely to be. In figure 21, the curves A and B showing the
amount at compound interest have been extended in continuation
with their trends and it is possible to read from the dotted
line that the amounts at compound interest in 1943 will be
Rs. 50.6 and Rs. 506.4 respectively.

The logarithmic scale is specially important in the case
of index historigrams. They should generally be drawn on
ratio scales, because index numbers are more concerned with
proportionate changes than with actual ones. Index numbers
plotted on the natural scale convey a false impression. For
example, if price index numbers for three successive years are
100, 130 and 160, each succeeding number differs from the
preceding one by 30. This difference would be represented
by equal distances on the natural scale, so that the rise in
prices would appear to be equal. But a change from 100 to
130 implies an increase of 30%, while that from 130 to 160
means a rise of 23 1/13 per cent. Therefore, the difference
between the indices when plotted on the ratio scale would
be shown as 30% between the first and the second year and
23 1/13 per cent. between the second and the third year.
The percentage changes of the two periods will then be comparable.
Such comparison, it was pointed out while dealing
with index historigrams drawn on natural scale, is not possible
when simple, and not logarithmic, index historigrams
are used. Thus, logarithmic graph is very useful for relative
comparisons in point of time.
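The two percentage rises quoted for the index numbers 100, 130 and 160 can be verified with exact fractions, and the corresponding distances on a ratio scale computed alongside. The function names are assumptions made for this sketch, and a base-10 logarithmic scale is assumed.

```python
from fractions import Fraction
import math

def percentage_rises(indices):
    """Exact percentage rise of each index number over the one before it."""
    return [Fraction(100 * (later - earlier), earlier)
            for earlier, later in zip(indices, indices[1:])]

def ratio_scale_steps(indices):
    """Vertical distance between successive index numbers on a log10 scale."""
    return [math.log10(later / earlier)
            for earlier, later in zip(indices, indices[1:])]

indices = [100, 130, 160]
print(percentage_rises(indices))    # [Fraction(30, 1), Fraction(300, 13)]
print([round(step, 4) for step in ratio_scale_steps(indices)])
```

The fraction 300/13 is 23 1/13, agreeing with the figure in the text, and the second distance on the ratio scale is shorter than the first although both differences on the natural scale are 30.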





Logarithmic graphs have two disadvantages. Firstly,
they are no good for comparison of the absolute sizes of different
variables. Secondly, negative values cannot be shown
on the logarithmic scale. It may also be added as a third
disadvantage that the ordinary reader is unfamiliar with
logarithms and logarithmic graphs and is, therefore, unable
to interpret what ratio charts imply.

General Remarks. 

We have studied the graphical methods by which continuous
series spread over a period of time can be presented
and changes in a single variable or in two or more variables
can be studied. It should be noted that we have taken no
account of the study and comparison of short-time and long-time
changes in a time series. It would be taken up in the
next chapter.

We have also discussed that the changes in a time series
can be graphically studied in their absolute values through
absolute historigrams, in their values relative to the base by
index historigrams, and in their proportional values by
historigrams, absolute or index, drawn on the ratio scale.
We have, therefore, studied the methods by which comparisons
can be made in point of time. We have not studied
the statistical nature of a group, which may be a third object
of our study, the first two, as already pointed out, being the
study of changes in a single variable and that of changes in
two or more variables. We now proceed to take up the
study of statistical nature of a group and, accordingly, discuss
the graphical methods by which frequency distributions
are presented.


FREQUENCY GRAPHS


Frequency distribution may be discrete or continuous.
A table giving frequency distribution of a group presents
the data in compressed form, but many people can normally
appreciate the relative sizes of a number of quantities more
readily when they are graphically presented than they can do
by looking at a table. Graphical presentation may, therefore,
be very well employed as an addition to the method of tabulation
in bringing out the statistical nature of a group,
whether discrete or continuous.

Statistical Nature of a Group. 

In Chapter X a good number of tables relating to frequency
distribution of groups are found. In all of them one
characteristic would be observed: it is that frequencies
rise up to a certain maximum point, and begin to fall after
this point is reached. Another point to be noticed is
that this rise and fall shows certain regularity. The question
that naturally arises is whether this regular rise and fall
noted in the tables in Chapter X is simply an arbitrary
assumption, a characteristic of the particular frequency
distributions referred to, or a feature common to many varieties
of phenomena. It would be found that this feature is
common to many or most phenomena.

Let us pick up a good number of leaves from any tree
at random, measure their lengths, and tabulate the lengths
in certain well-defined groups. Or, let us take a few rupee
coins, toss them, and see how many times only one coin falls
with head upwards, how many times two fall in the same
manner, how many times three, and so on. In both these
cases it will be found that the frequencies begin with small
magnitude, rise up to a certain maximum and begin to fall.
The former is a case of natural phenomena, and the latter of
pure chance, and yet the rise and fall of frequencies would
occur in identical manner in both of them.

Let us take another example. Of the 111 students admitted
to the B.A. class of a certain college in a particular year,
55 students were picked up at random, their heights measured,
and tabulated as in table 47.


Table 47. Height of 55 boys arranged in ascending order.
(Heights in inches.)

Serial  Height   Serial  Height   Serial  Height   Serial  Height   Serial  Height
 No.              No.              No.              No.              No.

  1     59        12     63.5      23     66        34     67        45     68.5
  2     60        13     64.5      24     66        35     67        46     68.5
  3     61        14     64.5      25     66        36     67.5      47     69
  4     61.5      15     64.5      26     66        37     67.5      48     69
  5     62        16     65        27     66.5      38     67.5      49     69
  6     62        17     65        28     66.5      39     67.5      50     69.5
  7     62.5      18     65        29     66.5      40     68        51     69.5
  8     62.5      19     65.5      30     66.5      41     68        52     70
  9     63.5      20     65.5      31     67        42     68        53     70
 10     63.5      21     65.5      32     67        43     68        54     71
 11     63.5      22     65.5      33     67        44     68.5      55     72


From the table it will be found that (1) the height of
67 inches is repeated the largest number of times, (2) as we
proceed on either side of 67 the number of times each height
is repeated is not only less than the number of times 67
inches is repeated, but also the number of times each height
occurs goes on falling, so that (3) a large majority of heights
group round 67 inches. 67 inches is the modal height.

From these examples we can deduce the following law
which would apply to most other cases: Phenomena tend
to fluctuate about a norm known as the mode, and a large
majority of items cluster round it. As the distance from
the mode widens, the items become fewer.






We can arrive at this conclusion graphically as well. 
We may plot the array of lengths of leaves or of the results 
of tossing coins. We have plotted in figure 22 the array of 
the heights of 55 boys. In this graph we see that near the 
extremes the heights change rapidly, while the fluctuations 
are not so marked in the middle. We further see that the 
mode is 67 inches since the largest number of lines stand for 
this number in the figure. 

[Figure 22. Array of heights of boys chosen at random. Horizontal 
axis: serial number of heights of boys arranged in ascending order.] 


One might like to know whether the same result would 
follow if the heights of the remaining 56 students were also 
measured. That is, would the location of the mode be affected 
by the inclusion of more items? The answer is that if the 
55 boys who were selected represented a fair sample of the 
whole class, then the use of a larger number would give a 
greater regularity in the variation of the sizes, and results 
not materially different from the one we have arrived at 
would follow. 





From figure 22 it is very easy to locate the values of the 
median and the two quartiles. The line drawn parallel to 
the base from the height of the 28th boy cuts the scale at 66.5 
inches which is the required value of the median. Similarly, 
quartiles can be located, as shown in the figure. 
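
The same reading can be taken from the array itself without a graph. The sketch below is illustrative only: the helper item_at is a hypothetical name, and the positions used for the quartiles, namely the (n+1)/4th and 3(n+1)/4th items, are an assumed convention that may not be the one adopted elsewhere in this book.

```python
# Heights (inches) of table 47 as (height, number of boys) pairs.
PAIRS = [
    (59, 1), (60, 1), (61, 1), (61.5, 1), (62, 2), (62.5, 2), (63.5, 4),
    (64.5, 3), (65, 3), (65.5, 4), (66, 4), (66.5, 4), (67, 5), (67.5, 4),
    (68, 4), (68.5, 3), (69, 3), (69.5, 2), (70, 2), (71, 1), (72, 1),
]
heights = [h for h, f in PAIRS for _ in range(f)]


def item_at(values, position):
    """Value at a 1-based position of an ascending array.

    A fractional position is resolved by simple interpolation between
    the two neighbouring items (an assumption of this sketch).
    """
    lower = int(position)
    fraction = position - lower
    value = values[lower - 1]
    if fraction:
        value += fraction * (values[lower] - values[lower - 1])
    return value


n = len(heights)
median = item_at(heights, (n + 1) / 2)              # the 28th boy
lower_quartile = item_at(heights, (n + 1) / 4)      # the 14th boy
upper_quartile = item_at(heights, 3 * (n + 1) / 4)  # the 42nd boy
print(median, lower_quartile, upper_quartile)
```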

Frequency Graphs for Discrete Series. 

The simplest mode of illustrating a discrete series 
is the line or bar frequency diagram. Size of item is taken 
on the horizontal line and the frequency on the vertical 
scale. Table 21, chapter XI, gives a discrete series which 
is graphically presented in figure 23. Instead of drawing 
simple lines as done in the figure, bars of uniform thickness 
could also have been drawn to improve the appearance 
of the figure. Sizes of items should be, as they are in the 
figure, separated on the horizontal scale by sufficient blank 
space so that neighbouring lines become absolutely distinct 
from each other. 

[Figure 23. Line frequency diagram.] 

Frequency Graphs for Continuous Series: Histogram. 

If in a series the range, that is the difference 
between the largest and the smallest items, is very large 
and instances occur at a great number of points between 
the two extremes, the above method of the line or bar 
frequency diagram is not suitable, for it is impracticable 
to place a line at each measurement. In such cases the 
data must be divided into classes and each class treated as a 
whole. Table 48 gives such a classification of the data relating 
to the heights of 55 boys given in table 47. 

Table 48. Frequency distribution of the heights of 55 boys. 

Height in inches      No. of boys
(size of item)        (frequency)

    59-61                  2
    61-63                  6
    63-65                  7
    65-67                 15
    67-69                 16
    69-71                  8
    71-73                  1


This frequency distribution can be illustrated by the 
rectangular diagram or histogram as shown in figure 24. 
The histogram is composed of a set of rectangles, one over 
each class interval on the horizontal scale. The heights of 
the rectangles are in proportion to the frequencies in the 
class. The area thus enclosed is bounded by the lines of the 
ordinates, the base line and the parallels to the base line at the 
top of each class-interval. 

[Figure 24. Histogram representing the heights of 55 boys.] 



The rectangular histogram has a few characteristic 
features. The series of rectangles in the figure illustrates 
fairly accurately the relative size of the various groups. The 
entire distribution of frequencies among the several classes 
becomes at once visible. The histogram is a better representative 
of the heights of the 55 boys actually measured than any 
smooth curve (figure 25) would be, although the smooth curve 
would be a better representative of the heights of 111 students. 

The total area of the rectangle erected on each class-interval 
is exactly equal to the number of frequencies in that 
class, the unit of area being measured by a rectangle one 
frequency unit in height and one class-interval in width. 
Thus the area of the figure equals the total number of 
frequencies. These two are, no doubt, significant facts of the 
rectangular histogram, but it is not without its defects. 

One defect of the histogram is that different groupings would 
give different shapes. If the class-intervals in table 48 were 
made narrower, the steps in the histogram would decrease in 
size. Secondly, it suggests, for instance, that there are 
2 students each 60 inches high, 6 students each 62 
inches high, and so on. As a matter of fact, each group 
consists of boys having different heights and therefore the 
rectangular presentation is misleading. To do away with 
these defects, a system of smoothing the histogram has to be 
devised so that a curve as typical of the entire data as possible 
may result. For this purpose, the Frequency Polygon or the 
Frequency Curve may be substituted for the histogram. 

Frequency Polygon. 

A simple method of smoothing is simply to connect the 
outer extremities of the base of the histogram with the mid-points 
of the tops of the rectangles, as is shown by the dotted 
line in figure 24. In the figure, the lines connecting the mid-points 
of the tops of the rectangles have been extended to 
the base at points 58 and 74 inches, the mid-points of the two 
rectangles outside the histogram, at which the observed 
frequencies are zero. This procedure gives an area representation 
of the frequency distribution which is exactly equal to 
the area of the histogram. The triangular strips of area which 
are excluded from the histogram are equal to those formerly 
outside the histogram but now included in the polygon. 
(Compare a and a', b and b', c and c', etc.) Thus the 
area of the polygon is equal to the area of the histogram. 
But the area of each rectangle of the histogram is not 
equal to the area of the corresponding section of the polygon. 
For, the area cut off from the class-interval 61-63, say, has been 
added to the preceding class, 59-61. To this extent, the polygon 
may be said to have re-distributed the frequency distribution. 
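
The equality of the two areas can be checked by arithmetic. The sketch below is illustrative and uses hypothetical function names; it takes the classes and frequencies of table 48, builds the vertices of the polygon (including the two zero points at 58 and 74 inches) and measures the area under the polygon by adding up the trapezoids between successive vertices.

```python
# Table 48: lower limits of the class-intervals (width 2 inches) and frequencies.
LOWER_LIMITS = [59, 61, 63, 65, 67, 69, 71]
FREQS = [2, 6, 7, 15, 16, 8, 1]
WIDTH = 2


def histogram_area(freqs, width):
    """Sum of the rectangles: frequency times width of class-interval."""
    return sum(f * width for f in freqs)


def polygon_vertices(lower_limits, freqs, width):
    """Mid-points of the rectangle tops, closed on the base line at the
    mid-points of the two empty classes just outside the histogram."""
    mids = [lo + width / 2 for lo in lower_limits]
    xs = [mids[0] - width] + mids + [mids[-1] + width]
    ys = [0] + list(freqs) + [0]
    return list(zip(xs, ys))


def polygon_area(vertices):
    """Area between the polygon and the base line (sum of trapezoids)."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(vertices, vertices[1:]))


vertices = polygon_vertices(LOWER_LIMITS, FREQS, WIDTH)
print(vertices[0], vertices[-1])
print(histogram_area(FREQS, WIDTH), polygon_area(vertices))
```

Both areas are expressed here in inches multiplied by frequency; dividing by the width of the class-interval gives the total frequency, 55.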


[Figure 25. The frequency polygon and the frequency curve 
representing the heights of 55 boys.] 



The main purpose of the polygon is to find the mode of 
the given series. Mode can be ascertained fairly accurately 
by the apex of the polygon. Apex would in all probability 
occur in the class-interval containing the mode, and will not 
be shifted greatly even if some more items are added to the 
series, so that the original frequency distribution is modified. 

The frequency polygon, however, has a defect. It shows 
sudden changes in the direction of the curve, particularly at 
the apex, P. It, therefore, fails to show regular and uniform 
variation in magnitude, which purpose is well served by a 
continuous smooth curve, called the Frequency Curve. 



Frequency Curve. 

The frequency polygon gives the first approximation to 
a continuous smooth curve. In figure 25 the frequency 
curve, also called the smoothed histogram, has been drawn freehand 
and is shown by the continuous line. Smoothing freehand 
requires great care, which, if not taken, would lead to a 
fallacious presentation of facts. When smoothing a frequency 
polygon the fact that it is really derived from the histogram 
should always be kept in view. This would imply that the 
top of the curve would overtop the highest point of the polygon, 
particularly when the magnitude of class-intervals is 
large. Again, the curve should look as regular as possible; 
all sudden turns should be avoided. The extent of smoothing 
permissible would, however, depend on the particular data 
under study. If the data consist of records of natural phenomena, 
like the measurement of leaves, or of chance phenomena, 
as the tossing of rupee coins, smoothing may be freely 
resorted to, since such phenomena normally have a symmetrical 
curve, but if the phenomenon under study is social or economic, 
skewness, sometimes considerably large, may be expected 
in the normal curve. In smoothing such a polygon only 
minor irregularities may be eliminated. The smoothed histogram 
should begin and end on the base line, since a continuous 
series, which it represents, begins with very few instances 
which go on rising but decrease again slowly to zero. As a 
general rule, it may be extended to the mid-point of the 
class-intervals just outside the histogram. Another fact that must 
be kept in view while smoothing a frequency polygon is that 
the area under the curve should represent the total number 
of frequencies in the entire distribution. In the matter of 
smoothing, experience is the best teacher. 

The frequency curve has certain characteristics. In most 
cases, particularly in natural and chance phenomena, it is 
bell-shaped. The bell-shaped curve is also called the Normal 
Frequency Curve or the Normal Curve of Error. The normal curve 
indicates what is expected of the phenomenon to which the 
curve relates under normal conditions. This curve eliminates 
accidental variations and establishes normal tendencies. If 
such a curve has been once obtained with an adequately 
representative sample, it can be utilized to speak for the whole universe. 
For instance, it may be said that if more measurements are 
taken, not only will they fall within the curve, but most of them 
would be found to cluster round the mode. Or, if the groups 
are re-arranged, those groups which are nearer the modal 
group will contain, as a rule, more cases than the groups more 
distant. 

To draw the frequency curve it is necessary first to draw 
the polygon. The polygon is later smoothed out. The frequency 
polygon may be drawn, even without first drawing the histogram, 
by plotting the frequencies at the mid-points of the 
class-intervals and joining them by straight lines. This is no 
doubt easy but presents difficulties in smoothing the polygon 
properly. Therefore, it is always safe to proceed in a sequence: 
first draw the histogram, then the polygon, and lastly 
smooth it, keeping in view the fact that the area of the curve 
should equal that of the histogram. 

Ogive Curve. 

Of the three methods of presenting a frequency distribution 
(the histogram, the frequency polygon, and the frequency 
curve) the last is the best for many purposes. But these 
methods are based on the frequencies of the class-intervals and 
not on the cumulative frequencies. The ogive curve is based on 
cumulative frequencies and is, therefore, also designated as the 
cumulative frequency curve. 

Table 49 gives cumulative frequencies of the frequency 
distribution of the values of 204 shares of the Imperial Bank 
of India, taken week by week from 1st January 1933 to 
December 1933. 





Table 49. Cumulative Frequency Table showing market 
values of the shares of the Imperial Bank of 
India (Paid-up Value Rs. 500). 

Value of shares      No. of shares      Cumulative
     (Rs.)            (frequency)       frequency

  1150-1200               11                11
  1200-1250               44                55
  1250-1300                9                64
  1300-1350               10                74
  1350-1400                7                81
  1400-1450                6                87
  1450-1500               12                99
  1500-1550               30               129
  1550-1600               51               180
  1600-1650               20               200
  1650-1700                4               204


To construct the cumulative frequency curve from the table, 
the horizontal and vertical scales are taken just as in the case 
of the histogram, polygon or curve; but the essential difference 
between the plotting of the frequency polygon and of the ogive curve 
is that in the polygon the frequency must be plotted at the 
mid-point of the class-interval, but in the ogive it must always 
be plotted at the upper limit of the class-interval. Thus, in figure 
26 we mark 11 against Rs. 1,200, 55 against Rs. 1,250, and so on. 
The successive points are later connected by straight lines 
with a ruler. The resulting curve is an ogive curve. This 
curve can also be smoothed, much like the smoothing of the 
frequency polygon, but it has not been done in figure 26. 
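
The points to be plotted can be set out as follows. This is a minimal sketch with hypothetical names; it accumulates the frequencies of table 49 and pairs each running total with the upper limit of its class-interval, as the rule above requires.

```python
from itertools import accumulate

# Table 49: upper limits of the class-intervals (Rs.) and their frequencies.
UPPER_LIMITS = [1200, 1250, 1300, 1350, 1400, 1450,
                1500, 1550, 1600, 1650, 1700]
FREQS = [11, 44, 9, 10, 7, 6, 12, 30, 51, 20, 4]


def ogive_points(upper_limits, freqs):
    """Pair each upper limit with the cumulative frequency up to it."""
    return list(zip(upper_limits, accumulate(freqs)))


points = ogive_points(UPPER_LIMITS, FREQS)
for limit, cumulative in points:
    print(limit, cumulative)
```

Joining these points in order by straight lines gives the unsmoothed ogive described above.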

Ogive curves, or simply ogives, may be used for the 
purpose of comparing groups of statistics in which time is 
not a factor. Ogives, in general, are not easy for the ordinary 
person to interpret. Histograms are readily understood by 
him. Ogives are primarily drawn for determining medians. 

... the vertical scale is drawn parallel to the horizontal axis to 
intersect the ogive at M (see figure 26), and then a perpendicular 
is drawn from M on OX cutting it at N. ON, read from 
the horizontal scale, gives the median which is Rs. 1,505. Similarly, 
ON" gives the value of the upper quartile as Rs. 1,576 and 
ON', the value of the lower quartile as Rs. 1,245. Deciles and 
percentiles can also be likewise determined. This method of 
locating the median etc. is far more easy and simple than the 
methods discussed in Chapter X, and is particularly so when 
the data given are imperfect. 
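
Reading the ogive in this way amounts to interpolating along the straight line that joins two successive plotted points. The sketch below, with hypothetical names, does this by arithmetic for table 49. It assumes that the median and quartiles are taken at one-half, one-quarter and three-quarters of the total frequency, and that the ogive starts from zero at Rs. 1,150. A value read from a hand-drawn graph need not agree with the computed figure to the last rupee.

```python
from itertools import accumulate

# Table 49: class boundaries (Rs.) and the frequency of each class.
LIMITS = [1150, 1200, 1250, 1300, 1350, 1400,
          1450, 1500, 1550, 1600, 1650, 1700]
FREQS = [11, 44, 9, 10, 7, 6, 12, 30, 51, 20, 4]
CUMULATIVE = [0] + list(accumulate(FREQS))  # ogive ordinates at each boundary


def value_at(rank):
    """Value on the horizontal scale at which the straight-line ogive
    reaches the given cumulative frequency."""
    for i in range(1, len(LIMITS)):
        below, upto = CUMULATIVE[i - 1], CUMULATIVE[i]
        if rank <= upto:
            share = (rank - below) / (upto - below)
            return LIMITS[i - 1] + share * (LIMITS[i] - LIMITS[i - 1])
    raise ValueError("rank exceeds the total frequency")


total = CUMULATIVE[-1]
median = value_at(total / 2)
lower_quartile = value_at(total / 4)
upper_quartile = value_at(3 * total / 4)
print(round(median, 1), round(lower_quartile, 1), round(upper_quartile, 1))
```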

The ogive is useful for yielding other results as well. 
Suppose an ogive represents the cumulative frequencies of 
income-tax-payers in a certain country. We can, from it, 
easily find out the total number of tax-payers paying not less 
than a certain sum. Again, if data relate to wages of employees 
in a factory, the number of workers getting not less 
than a certain wage can be ascertained. Similarly, if the 
government of a country wishes to formulate a scheme of 
graded retrenchment, this method of determining the number 
of employees getting not less than a certain salary would be 
found very useful. For, simply an ordinate need be drawn 
from the amount of money under consideration in the three 
cases to intersect the ogive, and the value of this ordinate be 
read on the vertical scale to know the number of tax-payers, 
wage-earners or government employees as the case may be. 
Further, the mode can also be located on the ogive as the 
frequencies are most numerous where the curve has the 
greatest tendency to run parallel to the vertical scale. 
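
A sketch of such a reading is given below, using the share values of table 49 in place of tax or wage data, which the text does not supply. The function names are hypothetical. The sketch assumes an ogive plotted at the upper limits as described earlier, so that the ordinate gives the number of items below the chosen amount; the number not less than that amount is then obtained by deducting the ordinate from the total.

```python
from bisect import bisect_right
from itertools import accumulate

# Table 49: class boundaries (Rs.) and the frequency of each class.
LIMITS = [1150, 1200, 1250, 1300, 1350, 1400,
          1450, 1500, 1550, 1600, 1650, 1700]
FREQS = [11, 44, 9, 10, 7, 6, 12, 30, 51, 20, 4]
CUMULATIVE = [0] + list(accumulate(FREQS))
TOTAL = CUMULATIVE[-1]


def number_below(value):
    """Ordinate of the straight-line ogive at the given value."""
    if value <= LIMITS[0]:
        return 0.0
    if value >= LIMITS[-1]:
        return float(TOTAL)
    i = bisect_right(LIMITS, value) - 1  # class-interval containing the value
    share = (value - LIMITS[i]) / (LIMITS[i + 1] - LIMITS[i])
    return CUMULATIVE[i] + share * FREQS[i]


def number_not_below(value):
    """Items at or above the value: total less the ogive ordinate."""
    return TOTAL - number_below(value)


print(number_below(1500), number_not_below(1500))
```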

Galton’s Method of Locating the Median.* 

Mr. Galton has given a graphic method by which the median 
can be located. The horizontal line is divided into equal parts 
corresponding to the unit of measurement, and the vertical line 
is similarly divided to show the frequency. The only essential 
feature of this method is that every preceding measurement 
is made the base for the next measurement. 

* It should not be confused with the Galton Graph. 

Table 47 gives the heights of 55 boys arranged in ascending 
order. Table 50 reproduces the data given in table 47 in the 
form of a frequency distribution. 

Table 50. Frequency Distribution of heights of 55 boys. 

Height  Frequency    Height  Frequency    Height  Frequency    Height  Frequency

59         1         63.5       4         66.5       4         69         3
60         1         64.5       3         67         5         69.5       2
61         1         65         3         67.5       4         70         2
61.5       1         65.5       4         68         4         71         1
62         2         66         4         68.5       3         72         1
62.5       2

[Figure 27. Galton’s method of locating the median 
in a series of heights of 55 boys.] 








The frequency distribution given in table 50 is graphically 
represented by Galton’s method in figure 27. Starting with 
59, one dot is put down on the ordinate through the 59 mark, 
standing for one student. From this point a horizontal line 
is drawn up to 60, the next height, on the ordinate through the 
point 60. Starting from the new base, only one dot is again 
marked. From this dot again a horizontal line is drawn and, 
proceeding in the same manner, as many dots are marked at 
each successive height as there are students of that height. 
Thus 55 dots are put down. When this has been done, lines 
are drawn to connect every two successive dots which are 
horizontally apart. Where the dots are odd in number, the 
line passes through the middle dot, while the line passes 
through the point midway between the two middle dots if the 
number of dots is even. Thus a continuous curve is obtained. 
To locate the median, the position of the 27½th point is marked on 
the vertical scale, since the height of the 27½th boy is the 
median. From this point a horizontal line is drawn to intersect 
the curve. From the point of intersection a perpendicular is 
drawn on the horizontal base to intersect it at N. ON gives 
the value we desire. Thus 66.5 inches is the median of the 
heights of 55 boys. 


EXERCISES 

(1) ‘ The wandering of a line is more powerful in its effect on 
the mind than a tabulated statement.’ Elucidate this statement. 

(2) What points must be borne in mind in drawing a statistical 
graph? 

(3) Write short notes on: 

Historigram, histogram, logarithmic curve, ogive curve, false 
base line, frequency polygon. 

(4) How will you compare the proportional changes in two 
or more variables? Give the details of the procedure you would 
adopt. 

(5) How does the Natural Scale differ from the Ratio Scale? 
In which cases should the latter scale be used? 

(6) Describe the Galton’s method of locating the median. 





(7) The following table gives the value of Imports and 
Exports of India for the years 1920-21 and 1921-22. 

In crores of Rupees. 

                    1920-21                1921-22
Months          Imports   Exports      Imports   Exports

April              22        28           26        18
May                24        28           21        20
June               26        23           19        17
July               28        21           18        17
August             31        20           21        20
September          29        22           20        20
October            32        21           23        18
November           32        19           26        20
December           32        20           23        22
January            31        19           28        28
February           25        18           20        22
March              24        19           21        28

Plot the above figures on a graph paper, and show also the 
balance of trade. 

(B. Com., AUd., 1^8). 

(8) Following are the monthly cheque clearances in India 
during 1942-43. Present them graphically and give necessary 
comments. 

Months          Cheque Clearances
                  Crores of Rs.

April                181.6
May                  219.6
June                 188.2
July                 198.0
August               244.6
September            218.0
October              236.6
November             2S3.0
December             238.6
January              262.7
February             221.2
March                218.8






(9) The following table gives Index Numbers since 1925 for 
Calcutta, Bombay and Karachi (base July 1914=100). Show these 
by means of a suitable graph. 

Year      Calcutta    Bombay    Karachi

1925        159         163       151
1926        148         149       140
1927        148         147       137
1928        145         146       137
1929        141         145       133
1930        116         126       108
1931         96         109        95
1932         91         109        99
1933         87          98        97
1934         89          95        96
1935         91          99        99
1936         91          96       102
1937        102         106       108


(10) The following table shows the total sales of gold by the 
Bank of England on foreign account. Represent the data 
graphically on the logarithmic scale: 

Year        £ '000

1910        14,488
1911         8,228
1912         9,670
1913         7,943
1914         8,027
1915        43,076
1916         2,360

(B. Com., Alld., 1932). 

(11) Represent graphically the data given below on a single 
sheet of graph paper to bring out clearly the relative fluctuations 
in the prices of various articles. Draw such conclusions as you can 
from the graphs. 

Wholesale prices in Cawnpore. 

(in rupees per maund) 

Year     Rice    Wheat    Linseed    Gur     Cotton    Tobacco

1928     7.3     7.7      7.0        6.5     34.1      17.3
1929     7.7     5.5      8.0        7.3     29.8      17.1
1930     5.8     3.6      6.5        6.2     17.3      14.5
1931     4.1     2.7      4.2        4.2     13.3      11.6
1932     4.3     3.4      3.5        3.5     14.8       4.9
1933     4.1     3.2      3.4        3.1     12.9       4.9
1934     3.7     2.8      3.6        4.x     13.2       5.7

Com., Alld.^ 1948). 

(12) Show the results of working of Class I railways in India 
graphically and comment thereon. 

(In millions of £) 

Year        Capital outlay    Gross Earnings

1923-24          464                70
1924-25          473                74
1925-26          487                73
1926-27          505                72
1927-28          594                36
1928-29          599                86
1929-30          617                84
1930-31          627                77
1931-32          631                71
1932-33          638                70
1933-34          635                72

(B. Com., Agra, 1940). 

(13) Distribution of firms in Woollen and Worsted Industries 
in Yorkshire, according to number of operatives: 

Operatives    No. of firms      Operatives    No. of firms

  1-20            380            301-340           24
 21-60            320            341-380           18
 61-100           182            381-420           16
101-140           147            421-460           11
141-180            92            461-500            9
181-220            66            501-700           19
221-260            39            701-900           15
261-300            30            901-              16

Total Number of firms                            1884

Represent this distribution graphically (by means of a cumulative 
diagram) and from this graph estimate the median and 
quartiles of the group. 

(B. Com., Luck., 1930). 

(14) Present the following figures relating to monthly imports 
(volume and value) of liquor into India graphically so as to show 
their fluctuations, and give necessary comments. 

1941-42 

Months          Volume          Value
              (000) Gals.      Rs. (000)

April             463            2648
May               395            1932
June              358            2113
July              415            2656
August            330            2104
September         363            2339
October           466            3106
November          349            2319
December          209            1626
January           2S0            2107
February          159            1667
March             362            1993


(15) Plot the following figures relating to population of India 
so as to show the proportionate increase in population from one 
period to another. 

Year          Population
          (000,000's omitted)

1872             210
1881             260
1891             290
1901             296
1911             315
1921             320
1931             350
1941             390






(16) Graphically present the figures given in exercise 24, 
chapter X and state whether the curve is skew. If yes, what is 
the nature of skewness: positive or negative? 

(17) Draw a line frequency diagram from the figures given 
for group A in exercise 24, chapter XI. 

(18) Study the movement of the exports of pig iron and of 
cotton goods from the figures given in exercise 26, chapter XI. 

(19) Draw a bar frequency diagram of the figures given in 
exercise 28, chapter XI. 

(20) Draw a frequency polygon from the figures given in 
exercise ^ chapter XI. 

(21) Draw a frequency curve from the data given in 
exercise 21, chapter XI. 

(22) The following table gives the age distribution of widows 
in India (Census Report 1931). Draw a graph showing the 
number of widows younger than any given age, and from the graph 
read off the median age of the widows and also the upper and the 
lower quartiles. 

Years             No. of widows

 0-10                 135,862
10-20                 718,101
20-30               2,456,835
30-40               4,847,631
40-50               6,480,259
50-60               5,908,159
60-70               3,743,615
70 and over         1,957,506

Total              26,247,968


(M. A., Alld., 1942). 

(23) Locate the median of the following figures by Galton’s 
Method. 

Length of Nim leaves in inches: 

1.35, 1.35, 1.6, 1.6, 1.7, 1.7, 1.9, 1.6, 1.5, 1.9, 2.0, 2.3, 
2.6, 2.8, 2.5, 2.3, 2.9, 3.4, 3.7, 2.9, 3.2, 3.4, 2.5, 2.8, 2.8, 2.6, 
2.5, 2.8, 2.4, 2.7, 2.7, 2.7, 2.9, 8.3, 1.8, 1.6, 1.5, 1.9, 1.5, 1.6, 
3.4, 2.7, 3.9, 8.5, 2.9, 2.1, 2.2, 2.3. 





(24) In the following table are given the quantity of white 
(bleached) cotton cloth imported into India and the price per 
yard. Bring out, graphically, the relation between price and 
quantity imported year by year, and comment on the relation 
indicated. 

Years       Cotton cloth imported    Average price per yard
                (million yds.)             (Rs. a. p.)

1924-25             649                      0 6 0
1925-26             465                      0 5 6
1926-27             571                      0 5 0
1927-28             657                      0 5 0
1928-29             564                      0 4 6
1929-30             474                      0 4 6
1930-31             272                      0 3 9
1931-32             280                      0 3 0
1932-33             412                      0 3 0

(25) Following table gives the production of sugar in Java 
and India during 1930-1939 in millions of quintals. Represent the 
figures by a suitable graph. 

Year        Java    India

1929-30      29      17
1930-31      28      20
1931-32      26      24
1932-33      14      28
1933-34       6      30
1934-35       5      31
1935-36       6      36
1936-37      14      40
1937-38      14      32
1938-39      15      27



CHAPTER XVI 

ANALYSIS OF TIME SERIES 

In examining the changes with time of a certain quantity 
we are concerned with the interpretation of these changes and 
with finding how they are related to similar changes which 
are observed in other time series. For instance, when we 
examine the series of index numbers below giving the relative 
fluctuations of the retail price of wheat in India (1873=100), we 
naturally ask ourselves to what the changes taking place are 
due and how they are related to changes in other series. 


Table 51. Index Numbers of the Retail Price of Wheat. 

(1873=100) 





Trend, Seasonal and Cyclical Fluctuations. 

Upon perusal of the annual average indices we find that 
there is, on the whole, a gradual increase in the retail price 
of wheat and that there are sudden breaks of large and small 
number of points in this gradual increase. We know that the 
series of retail price is a resultant of a large number of causes 
of different kinds, e.g., weather conditions, transport facilities, 
consumers’ demand for wheat, demand and prices of substitutes 
of wheat, and so on. We must consider the nature of 
these causes with a view to determining their effects. 

The first cause, and a fundamental one, is that with the increase 
in the population of India the demand for wheat, as for 
so many other commodities, has been growing. Therefore 
there is a certain growth factor which is effecting a general and 
gradual rise in the series. The resulting gradually changing 
nature of the retail price of wheat is referred to as the 
secular trend of the series and this trend should be supposed 
to be linked up with the growth factor. 

Secondly, operating along with the growth factor is a 
group of causes which do not operate continuously, but in a 
regular spasmodic manner. One among these causes is the 
seasonal factor. Seasons occur in the same way every year, 
e.g., winter being followed by summer and summer by rains in 
India. Crops have their sowing and harvesting seasons; May 
and June constitute the marriage season in India. The effect of this 
seasonal factor is a regular up and down movement in the 
series of figures relating to the phenomenon affected by the 
factor. This movement is referred to as the seasonal movement. 
If retail prices or retail price indices week by week 
or month by month are considered, an up and down movement 
of this kind would be noted due to the harvesting season 
in March-April, and this movement would be super-imposed 
on the secular trend. 

Another cause in this group, operating in a regular 
spasmodic manner, is the cyclical factor. During the 19th 
century a fairly regular up and down movement has been 
noticed in a good number of time series of economic data. 
These movements have been repeated at intervals of years 
ranging from 7 to 11, that is, these movements have occurred 
in a cycle. There are “boom” years in which the observed 
phenomenon shows upward movement, and there are “depression” 
or “crisis” years when it shows downward movement. 
Retail price of wheat is also affected by this “trade 
cycle” or cyclical movement. 

Lastly, in addition to the group of causes producing 
regular up and down movement in our series, there is another 
group which operates in an irregular manner. It includes 
such events as floods or raids of locusts, fires, earthquakes, 
wars, revolutions and so on which ruin the crops of the areas 
affected by them. It also includes such chance combinations 
of wind, sunshine, and rain in a certain season as may result 
in a bumper crop. While these causes do operate from time 
to time, there is no regularity in their operation. Retail price 
of wheat is also affected by such irregular fluctuations. 

We may conclude, then, that in analysing any given time 
series we look for three kinds of movement, viz., 

1. General trend. 

2. Regular Fluctuations. 

(i) Seasonal. 

(ii) Cyclical. 

3. Irregular Fluctuations. 

General trend refers to secular or long-period changes. 
Some influences, operating steadily and persistently from year 
to year, may be causing a general tendency for figures relating 
to a certain phenomenon to increase, to decrease, or to assume 
both directions. Regular and irregular fluctuations refer to 
periodic changes lasting a short period of time. Therefore, 
changes in time series may be spoken of as (1) long-time and 
(2) short-time. Long-time fluctuations include secular 
changes; while short-time oscillations may be classified into 
(i) seasonal, (ii) cyclical and (iii) irregular fluctuations. The 
primary task in analysing a time series, therefore, is to 
measure and isolate long-time and short-time changes. 

Measuring and Isolating Time Changes. 

In order to study any one of these changes by itself, it 
seems necessary to follow the method of the physicist who 
allows only one factor to vary at a time and eliminates all other 
factors. But the statistician can rarely control the conditions 
of his experiment and has, therefore, to be satisfied with ridding, 
so far as possible, the recorded data of the apparent 
effects of the extraneous causes. If we desire to study the long-time 
changes in the retail price of wheat, we shall do well to 
remove the short-time fluctuations from the field. But, if we 
are interested in short-time oscillations only, we should eliminate 
all long-time changes from our series. 

Let us first be concerned with the study of long-time 
variations, or, which comes to the same thing, trend. Since 
the value of an item on a particular date in a time series consists 
of the long-time plus the short-time changes, we can get a measure 
of the long-time change if we eliminate short-time change 
from the series. 

Elimination of Short-time Oscillations. 

If we plot our series on a graph paper we shall observe 
an up and down movement in the curve. The index historigram 
of the retail prices of wheat drawn in figure 28 is not 
a smooth curve: it is irregular. If we can smooth out these 
irregularities the short-time oscillations shall be removed. 
One way of doing it is to follow what may be called the freehand 
curve method. 

Freehand Curve Method. We may observe the up and 
down movement of the curve and smooth out the irregularities 
by drawing a freehand curve or line through the index historigram 
such that the curve so drawn would give a general 
notion of the direction of the change. This freehand curve 
eliminates the short-time oscillations and shows the long-period 
general tendency of the retail price of wheat. This 
is exactly what is meant by trend. 

But this method has the serious disadvantage that different 
people would draw the freehand line at different positions with 
different slopes. Naturally, there will be different conclusions. 
Therefore, in place of this method, the method of moving 
averages may be used. 

Method of Moving Averages. This is an alternative method
of ridding the historigram of its fluctuations. It involves
taking for each year of the series, not the value relating to
that year, but the average of that value and the values of one,
two, three or more years preceding and succeeding the year in
question. If, for instance, a three-yearly moving average is to be
computed, the values of the 1st, 2nd and 3rd years are added up, the
sum is divided by 3, and the quotient is placed against the 2nd
year; then the values of the 2nd, 3rd and 4th years are added up,
averaged, and the average is placed against the 3rd year; and
so on. These averages, when plotted on the same graph on
which the historigram has been drawn, would smooth out its
irregularity, show the long-period tendency and eliminate
short-time changes.
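
The three-yearly procedure described above can be sketched in a few lines of Python. This is only an illustration: the function name `moving_average` and the yearly figures are hypothetical and are not taken from table 50, and the sketch handles odd periods only, so that each average falls opposite a central year.

```python
def moving_average(values, period):
    """Centred moving average for an odd period.

    Returns a list as long as `values`; years for which a full group
    is not available (the first and last period // 2 years) are None.
    """
    if period < 1 or period % 2 == 0:
        raise ValueError("this sketch handles odd periods only")
    half = period // 2
    result = [None] * len(values)
    for centre in range(half, len(values) - half):
        group = values[centre - half : centre + half + 1]
        result[centre] = sum(group) / period
    return result


# Hypothetical yearly figures, used only to show the mechanics.
prices = [100, 106, 103, 112, 109, 118]
three_yearly = moving_average(prices, 3)
print(three_yearly)
```

The first average, (100 + 106 + 103) / 3, is placed against the 2nd year, the next against the 3rd year, and so on; the first and last years receive no average.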

What period of time should be used in calculating the moving
average? The period would vary with the periodicity of
historigrams. If a historigram appears to have a regular up
and down movement repeated at intervals of five years, five
years constitute the periodicity of the historigram, and a
five-yearly moving average should be used to smooth out the
fluctuations. The moving average method, therefore, pre-supposes a
'period'.

How to determine the 'period'? The best way of determining
the period is to observe the average time-distance between
the consecutive crests (peaks) and between the successive
troughs of the waves of the historigram, and thus obtain the
approximate wave-length. Our historigram shows a wave-length
of nine to eleven years, prominent crests falling in the
years 1879, 1888, 1897 and 1908, and troughs in the years
1876, 1885, 1894 and 1904. The average period, therefore,
may be taken as ten years, although it is preferable to use
an odd number of years for the moving-average group because
of the ease of plotting the average opposite the central
year of the group. To be more certain whether the upward
movement repeats itself every tenth year, we operate on the
series with moving averages of a few different periods and
observe whether any other moving average smooths out the
irregularities. We begin with the five-yearly moving average, which is
given in column 3, table 50. This series of moving averages
is plotted over the historigram in figure 28. The five-yearly
moving average curve shows considerable fluctuations, though
they are less than those in the historigram of annual indices.
Since it does not smooth out the irregularities, it cannot be
regarded as showing the long-time general tendency of the series.
If we similarly plot seven- or nine-yearly figures, we would
even then find some fluctuations in the moving average curves,
until we come to the 10-yearly moving average. Ten is an
even number and, therefore, the ten-yearly moving average
can be placed only midway between the fifth and sixth years of
each group, as in column 4 of table 50. We 'centre' these
ten-yearly moving averages by taking a two-yearly moving
average of the figures given in column 4. For example, the
ten-yearly moving average (centred) for 1881 is


$\frac{108+111}{2} \approx 110$, and for 188? is $\frac{110+115}{2} \approx 113$.


The curve of the ten-yearly moving average (centred), when
drawn through the historigram in figure 28, is far smoother
than the index historigram or the 5-yearly moving average
curve. This curve shows the general rising tendency of
wheat prices from 1873 to 1910. It, therefore, shows the
long-period variations or the trend, and eliminates short-time
oscillations. Consequently, the 10-yearly moving average
(centred) column in table 50 gives us the trend values.
We have thus measured the trend by eliminating short-time
fluctuations through the process of smoothing.
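
The centring of an even-period moving average may be sketched as follows. The function name `centred_even_moving_average` and the short series are hypothetical; a four-item period is used instead of ten merely to keep the example small. The last line applies the centring step to the two uncentred averages, 108 and 111, quoted in the text for 1881.

```python
def centred_even_moving_average(values, period):
    """Centred moving average for an even period (for example, 10)."""
    if period < 2 or period % 2 != 0:
        raise ValueError("this sketch handles even periods only")
    # Step 1: plain moving averages; each falls midway between two items.
    plain = [
        sum(values[start : start + period]) / period
        for start in range(len(values) - period + 1)
    ]
    # Step 2: a two-item moving average of the plain averages centres them.
    centred = [(first + second) / 2 for first, second in zip(plain, plain[1:])]
    half = period // 2
    result = [None] * len(values)
    for offset, value in enumerate(centred):
        result[half + offset] = value
    return result


# Hypothetical figures with a four-item period.
series = [10, 20, 30, 40, 50, 60]
centred_four = centred_even_moving_average(series, 4)
print(centred_four)

# The centring step applied to the two plain averages quoted for 1881.
centred_1881 = (108 + 111) / 2
print(centred_1881)
```

With a ten-yearly period the same procedure leaves five years without a trend value at each end of the series.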

The moving average method is easy of application and
enjoys an advantage over the freehand curve method in that 
different people will not obtain different results by using 
this method. But it has two limitations. 

(1) It does not carry the trend accurately to the
extremes of the data. Trend values relating to
the years from 1873 to 1877 and to those from 1906 to 1910
could not be ascertained in our example, because a centred
ten-yearly average needs five years on either side of the year
in question. This deficiency
may be made good in either of two ways:

(a) The moving average curve may be carried out freehand
to each extreme.

(b) Artificial final groups may be formed by duplicating
the numbers at the extremes. In table 50,
for instance, 170 might be added at the close five
successive times, forming the required new groups
for computing the 10-yearly moving average up to
1910.

Both these methods are, however, approximations. 

(2) It cannot be applied with equal success to any and
every historigram. It is useful only in those historigrams
which manifest more or less periodicity, for the object of using
this method is to eliminate periodicity. If a historigram does
not show regular periodicity, the period of the moving average
needed to smooth out its irregularities would obviously be very long,
and the moving average would show the general trend for the
whole period without allowing for any of the large variations
which it might be proper to retain.
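
Device (b) under the first limitation, the duplication of the numbers at the extremes, can be sketched as below. The helper name `pad_ends` and the three closing figures are hypothetical; only the final value of 170 is taken from the text. The padded series is, as already noted, an approximation.

```python
def pad_ends(values, extra):
    """Repeat the first and the last item `extra` times at either end."""
    values = list(values)
    return [values[0]] * extra + values + [values[-1]] * extra


# Hypothetical closing figures of a series that ends at 170.
closing_indices = [150, 160, 170]
padded = pad_ends(closing_indices, 5)
print(padded)
```

A ten-yearly centred moving average computed on the padded series then yields a value for every year of the original series.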


Retail price of wheat in India showing
annual fluctuations, five-yearly moving average, and the trend.



Periodicity and Cyclical Fluctuations.

The index historigram in figure 28 is undoubtedly
irregular in its shape, but the ups and downs
show remarkable regularity in their occurrence. It
will be observed that it shows recurring years of high
prices at intervals of about 10 years (1879, 1888, 1897, 1908)
and similarly recurring years of low prices (1876, 1885, 1894,
1904). The ups and downs, therefore, repeat themselves in a
cycle of nearly ten years. The curve thus shows cyclical
fluctuations, or we may call them periodic variations. The
elimination of this cyclical character would leave us with the
long-period trend. The merit of the method of moving averages lies
in the fact that if the period of time used in calculating the
moving average is approximately equal to the length of the
cycle, the moving average would eliminate the cycle and show
the trend. One period or one cycle is said to be completed
when, beginning with a peak, the falling curve reaches a minimum
point and then, rising again, reaches the next peak. Therefore,
the period for any cycle is expressed by the time-distance
between successive peaks. When the average of these
time-distances in any given series is taken, it gives the period of the
cycle for the whole period. In our series of wheat prices the
time-distances for successive peaks are 9, 9 and 11 years. The
arithmetic average of these three time-distances is 29/3, or
about 10 years. Therefore, ten years is the period for the
cyclical fluctuations of the retail price of wheat in India. That
is, the periodicity is ten years. This is why the ten-yearly
moving average smooths the irregularities of the historigram,
and shows the trend or long-period tendency of retail prices.
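
The calculation of the period from the time-distances between peaks can be checked with a short sketch. The peak and trough years are those quoted in the text; the helper name `average_gap` is hypothetical.

```python
peak_years = [1879, 1888, 1897, 1908]
trough_years = [1876, 1885, 1894, 1904]


def average_gap(years):
    """Time-distances between successive years, and their arithmetic average."""
    gaps = [later - earlier for earlier, later in zip(years, years[1:])]
    return gaps, sum(gaps) / len(gaps)


peak_gaps, peak_period = average_gap(peak_years)
trough_gaps, trough_period = average_gap(trough_years)
print(peak_gaps, round(peak_period, 2))
print(trough_gaps, round(trough_period, 2))
```

Both averages lie between nine and ten years, which is the ground for working with a ten-yearly moving average.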


The moving average method is of very great use in finding
the trend of prices when price changes show a cyclical character
and a trend. We take hypothetical examples to explain
our meaning. Let us suppose that the price of an agricultural
commodity rises and falls in the manner shown in column 2 of
table 51.





Table 51. Index Numbers of Prices.

  1        2           3            4              5              6
Year   Index Nos.   5-Yearly    Index Nos.     5-Yearly      Deviation of
       of prices    moving      of prices      moving        indices in col. 4
                    average     (with trend)   average       from moving
                                               (with trend)  average

1891      115                      115
1892      120                      122
1893      135         125          139            129            +10
1894      130         125          136            131            + 5
1895      125         125          133            133              0
1896      115         125          125            135            -10
1897      120         125          132            137            - 5
1898      135         125          149            139            +10
1899      130         125          146            141            + 5
1900      125         125          143            143              0
1901      115         125          135            145            -10
1902      120         125          142            147            - 5
1903      135         125          159            149            +10
1904      130                      156
1905      125                      153




If we draw a historigram from the data given in column
2, it would show that ups and downs occur regularly at an
interval of five years. Next, we take the five-yearly moving
average as put down in column 3 and plot it over the historigram.
We shall get a horizontal straight line, without any slope,
without any upward or downward movement. We then conclude
that the period of the cycle (periodicity) is 5 years, the
cycle exactly repeats itself, and the prices shown in column
2 have no 'trend'.

We take another example. If we draw a historigram
from the data given in column 4, it would also show that prices
vary in a cycle of five years. We take the five-yearly moving
average, as shown in column 5, and plot it over the historigram.
We shall again get a straight line, but this time the
straight line would rise from left to right. We conclude that
the period of the cycle is 5 years, the cycle exactly repeats
itself, and the prices shown in column 4 have an upward
'trend'.




It should now be noted that in both of the above examples
there are 'cyclical fluctuations', but in the first there is no
trend, while the latter has an upward trend.
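
The two hypothetical examples can be verified with a short sketch. The figures are those of columns 2 and 4 of table 51; the entries for 1899, 1900 and 1904 are read here so as to agree with the moving averages given in the table, which is an assumption. The function name `five_yearly` is hypothetical.

```python
def five_yearly(values):
    """Five-yearly moving averages, one for each year that has two
    neighbouring years on either side."""
    return [sum(values[i - 2 : i + 3]) / 5 for i in range(2, len(values) - 2)]


years = list(range(1891, 1906))
column_2 = [115, 120, 135, 130, 125, 115, 120, 135, 130, 125,
            115, 120, 135, 130, 125]
column_4 = [115, 122, 139, 136, 133, 125, 132, 149, 146, 143,
            135, 142, 159, 156, 153]

column_3 = five_yearly(column_2)  # no trend: a horizontal line
column_5 = five_yearly(column_4)  # upward trend: a rising line

for year, flat, rising in zip(years[2:-2], column_3, column_5):
    print(year, flat, rising)
```

The first series gives the same average for every year, while the second gives averages that rise by a constant amount each year; in both the five-yearly cycle has disappeared.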

The Smoothed Curve 

A smoothed curve shows the trend. Trend is
the course that would be taken by a curve in the absence
of disturbing factors. In the 10-yearly moving average line in
figure 28 all irregularities have disappeared, and we have
obtained the general rising trend of retail prices for the entire period.
This smoothed curve is therefore of no use for studying
short-time changes. We cannot learn from it when prices began
to rise or to fall. We cannot say that the retail price of wheat
began to rise in 1887 because the smoothed curve begins to
rise in that year. The average for 1887 is based on the prices
of ten years, of which the year in question is only one. To
study short-time (annual) changes, that is, to study when
prices began to rise and to fall, we must consult the original
historigram, and not the trend. If we study them from the
trend, misleading conclusions might be drawn. For example,
the shape of the smoothed curve for the period 1878 to 1880
(figure 28) might lead one to think that the price of wheat
was fairly steady during that period, whereas according to
the original data the indices 147, 158 and 118 show violent
fluctuations. The smoothed curve, therefore, is good only
for a study of the long-period general tendency of the
phenomenon under investigation.

Elimination of Long-time Variations.

We are very often interested in the study of
short-time oscillations of a time series. For this study
we should rid our data of long-time variations.
To do so is quite easy once the trend of the
series is known, for the difference between the
value of an item on a particular date and its corresponding
trend value is the short-time oscillation. Therefore, when a
historigram is given, a satisfactory method of eliminating
long-time variations would be to discover the trend and measure
the deviations of the original data from the trend. These
deviations may then be plotted on a horizontal base line.

We have discovered the trend of our series relating to the
retail price of wheat in India. The values of the ten-yearly
moving average in column 5, table 50, are the trend values.
To eliminate the trend we compute deviations of the prices
from the trend values. These deviations are placed in column
6 of the same table. We plot them on a graph in figure 29.
The resulting curve gives only the short-time oscillations of the
retail price of wheat unobscured by the long-time variations.
Fluctuations in price without the trend can now be studied
from the diagram very easily.





We can similarly eliminate the trend from the indices
given in column 4, table 51. Column 6 of the same table gives
the deviations of the indices from the 5-yearly moving average.
These deviations are the short-time fluctuations and, when
plotted on a graph, would show a regular rise and fall in
short-time oscillations recurring every fifth year. The cyclical
character of the fluctuations would thus be rendered clear.
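
The elimination of the trend from column 4 of table 51 may be sketched as follows. The figures are those of the table, with the entries for 1899 and 1904 read so as to agree with the moving averages given there (an assumption); the variable names are hypothetical.

```python
column_4 = [115, 122, 139, 136, 133, 125, 132, 149, 146, 143,
            135, 142, 159, 156, 153]

# Trend: the five-yearly moving average, available for 1893 to 1903.
trend = [sum(column_4[i - 2 : i + 3]) / 5 for i in range(2, len(column_4) - 2)]

# Short-time oscillation: original index minus its trend value.
deviations = [value - t for value, t in zip(column_4[2:-2], trend)]

for year, deviation in zip(range(1893, 1904), deviations):
    print(year, deviation)
```

The deviations repeat the pattern +10, +5, 0, -10, -5, which is the five-yearly cycle freed from the rising trend.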

Thus, the primary task of studying and eliminating long-time
and short-time changes in a time series is over. The
short-time oscillations would, it may be observed, consist of
seasonal, cyclical and irregular fluctuations, which may also be
separately measured.

Measuring Seasonal Variations. 

If we know that fluctuations in a given series are strictly
seasonal, we have a simple method of measuring them. Suppose
we are studying seasonal variations in the export
of raw jute from India for the period 1937-38 to
1942-43. We would find that the oscillations due to the "boom"
created by the war which began in 1939, or to the lack of
shipping space, are serious hindrances in our study. To get at
the strictly seasonal changes we should adopt the method of
obtaining a seasonal average for the period under consideration.
If monthly records only are available, the process of
finding this average would be to add up the figures separately
for each month and divide the sum by the total number
of years. Table 52 shows this process for a few months; the
same process is to be followed for the remaining months.


Table 52. Exports of Raw Jute from India in tons (000).

                                Year
Month    1937-38  1938-39  1939-40  1940-41  1941-42  1942-43   Total  Average

April       71       47       53       38       20       26      255      43
May         76       47       44       36       31        7      241      40
June        63       35       34       16       37       13      198      33
July        53       43       21        7       27       28      179      30








The averages placed in the last column of table 52 clearly
show the typical movement of exports of raw jute. If they
are plotted on a graph, the part of the year in which the
exports are the greatest and the part in which they are the
least would become easily visible. This would give a clear
idea of the seasonal fluctuations in the export of raw jute.
And, if figures for a larger number of years are taken, the
same seasonal fluctuations would be found to exist.
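
The seasonal averages of table 52 can be recomputed with a short sketch. The monthly figures are those of the table; the last May figure is read as 7 because that value agrees with the printed total of 241, which is an assumption. The dictionary names are hypothetical.

```python
exports = {
    "April": [71, 47, 53, 38, 20, 26],
    "May": [76, 47, 44, 36, 31, 7],
    "June": [63, 35, 34, 16, 37, 13],
    "July": [53, 43, 21, 7, 27, 28],
}

totals = {month: sum(figures) for month, figures in exports.items()}

# Average over the six years, rounded to the nearest unit with halves
# rounded upward (42.5 becomes 43, as in the table).
averages = {
    month: int(totals[month] / len(figures) + 0.5)
    for month, figures in exports.items()
}

for month in exports:
    print(month, totals[month], averages[month])
```

Python's built-in `round` would turn 42.5 into 42, which is why the halves are rounded upward explicitly here.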

Similarly, seasonal fluctuations in rainfall, temperature,
production of a commodity, sales of certain goods, withdrawal
of bank deposits, unemployment, etc. can be studied.

The series relating to prices of wheat (table 50) is expressed
in index numbers. The methods of eliminating long- and
short-time oscillations discussed above would apply equally
well to a series expressed in the original form, i.e., a series not
reduced to indices. Suppose we have a table giving daily
temperatures in degrees Fahrenheit for a certain place for a
month, and desire to determine the trend. We may plot the
temperatures on a graph and observe the average time-distance
of the cycle of fluctuations. Suppose the wave-length is
7 days. We may then operate on our series with a seven-day
moving average and get the desired trend.

Comparison of Time Changes in two Historigrams.

When comparison of time changes in two historical
variables is desired, they should first be reduced to index
numbers so that their relative fluctuations may be easily compared.
The two index series may then be presented in the
form of historigrams, their trends discovered, and the deviations of
the original items of the two series from their trends measured
and plotted in one graph with the same base and scale.
Comparison between the short-time oscillations of the two
index series can then be made by studying the movements of
the two curves.
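
The first step, the reduction of two series to index numbers, may be sketched as below. The two series are hypothetical, and taking the first year as the base (equal to 100) is an assumption made only for the illustration; the function name `to_index_numbers` is likewise hypothetical.

```python
def to_index_numbers(values, base_position=0):
    """Express each value as a percentage of the value in the base year."""
    base = values[base_position]
    return [round(100 * value / base, 1) for value in values]


# Hypothetical series measured in different units.
wheat_price = [40, 44, 50]     # e.g. rupees per unit
jute_export = [200, 210, 260]  # e.g. thousands of tons

print(to_index_numbers(wheat_price))
print(to_index_numbers(jute_export))
```

Once both series stand on the common base of 100, their trends and deviations can be plotted on one graph with the same scale.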





To compare long-time changes, the moving averages of the
two series should be plotted on the same base with the same
scale and the directions of the resulting moving average lines
studied. It would be better to draw the moving average
curves on the same graph on which the historigrams have been
drawn.


EXERCISES 


(1) Indicate briefly how you would analyze a series of monthly
records extending over 50 years.

(M.A., Alld., 1942).

(2) (a) Explain fully what is meant by secular trend, seasonal
variations, and cyclical fluctuations, illustrating your answer.

(b) Study the short-time fluctuations of the following
temperatures measured in degrees Fahrenheit:

Date        Temperature    Date        Temperature
1911                       1911
Feb.  1         10         Feb. 11         78
      2         50              12         80
      3          ?              13         60
      4         70              14         61
      5         52              15         62
      6          ?              16         68
      7          ?              17         86
      8         10              18         96
      9         50              19         91
     10         68              20         78

(B. Com., Alld., 1912).

(3) Compare the long-time changes and the short-time
oscillations of the following data:

Year   Index No.  Index No.     Year   Index No.  Index No.
           x          y                    x          y
1900      80        102         1916     100        108
1901      82        101         1917     101        106
1902      88        100         1918     102        112
1903      85        107         1919     106        111
1904      90        110         1920     102        110
1905      86        108         1921     101        109
1906      84        106         1922     100        108
1907      82        104         1923      98        108
1908      80        103         1924     103        113
1909      95        104         1925     101        112
1910      90        112         1926      99        111
1911      88        108         1927      98        108
1912      87        103         1928      93        108
1913      87        104         1929      90        115
1914     100        109         1930     102        107
1915     100        102         1931     100        102


(4) (a) How would you distinguish the cyclical fluctuations 
from the trend and the seasonal fluctuations? 

(b) The following table gives the value of the exports of
merchandise from India during the years 1919-20 to 1923-24.
Calculate the seasonal variations for each month during this period.

In Crores of Rupees

Months       1919-20  1920-21  1921-22  1922-23  1923-24

April           20       27       17       23       29
May             20       26       18       26       28
June            19       21       15       18       29
July            26       19       17       23       25
August          25       19       18       24       22
September       30       21       19       20       23
October         28       19       17       21       25
November        29       17       19       27       26
December        26       18       21       26       30
January         29       18       22       28       36
February        26       17       21       30       35
March           30       18       26       31       40

(M. A. Econ., Alld., 1937).

(5) The following table gives the Bank Clearings in
Bombay City for the years 1916 to 1940 in millions of rupees.
Find the trend, and verify your result graphically.

1916      52.7        1929      94.6
1917      79.4        1930      83.0
1918      76.3        1931     110.6
1919      66.0        1932     159.6
1920      68.5        1933     177.4
1921      93.8        1934     178.6
1922     104.7        1935     235.8
1923      87.2        1936     243.2
1924      79.3        1937     194.4
1925     103.6        1938     217.9
1926      97.3        1939     214.0
1927      92.4        1940     256.7
1928     100.7

(B. Com., Alld., 1943).

(6) Classify the different types of fluctuations which occur 
in the analysis of time-series. Illustrate your remark with the 
help of the following series: 


6099    6497    6898    7300    7699
6223    6621    7024    7421    7828
6351    6764    7152    7553    7949
6477    6878    7275    7675    8077

(M.A., Cal., 1937).


(7) Explain the use of moving averages in the analysis of
time series. Find out approximate moving average for the
following series:

1901     506      1906     696      1911    1189      1916     898
1902     620      1907    1116      1912     818      1917     814
1903    1036      1908     738      1913     745      1918     929
1904     673      1909     663      1914     845      1919    1360
1905     588      1910     777      1915    1276      1920     961
                                                      1921     926

(M.A., Cal., 1936).

(8) Write a note on the statistical analysis of time-series in
economic studies. Illustrate your remarks with the help of the
following table, using in particular 3-year and 5-year moving
averages:

Year   Value     Year   Value     Year   Value     Year   Value
1901    507      1908    552      1915    583      1923    628
1902    522      1909    556      1916    581      1924    632
1903    524      1910    548      1917    599      1925    626
1904    521      1911    572      1918    602      1926    644
1905    538      1912    569      1919    597      1927    643
1906    541      1913    567      1920    612      1928    642
1907    537      1914    587      1921    616      1929    661
                                  1922    608      1930    6?9

(M.A., Cal., 1935).

(9) Draw a graph of the following time-series and study its
trend:

Year    Value      Year    Value
1910     496       1920    1442
1911     615       1921    1617
1912     686       1922    1678
1913     835       1923    1791
1914     888       1924    1916
1915    1081       1925    1883
1916    1132       1926    2064
1917    1139       1927    2278
1918    1320       1928    2368
1919    1389       1929    2345

(B. Com., Cal., 1937).

(10) Plot the following Index Numbers of wholesale prices
in the U. S. A., and show the general trend of prices:

Year     Index Number of Prices
            (1910-14 = 100)
1800             129
1810             131
1820             106
1830              91
1840              96
1850              84
1860              93
1870             135
1880             100
1890              82
1900              82
1910             103
1920             226
1930             126

(B. Com., Alld., 1935).


(11) Business Cycles in the U. S. A. and England, arranged
in chronological order (1796-1923), have had the following
duration as measured to the nearest year:

U. S. A.:

6, 6, 5, 3, 7, 3, 3, 5, 1, 3, 6, 1, 2, 6, 4, 3, 5, 5, 4, 9, 5, 3,
2, 3, 4, 3, 4, 2, 3, 5, 2, 3.

England:

4, 6, 4, 3, 5, 4, ?, 4, 2, 6, 10, 7, 4, 8, 8, 9, 8, 10, 7, 6,
5, 2.

Tabulate the above figures in classes of one year each and
calculate the average duration of the business cycle in each country
separately.

(B. Com., Luck., 1939).

(12) What is meant by 'trend'? How would you statistically
eliminate the influence of seasonal and cyclic factors on the
long-period movement of any series?

(B. Com., Bombay, 1936).





(13) Do the following figures indicate a definite "period"
or "trend", or are they random? Graphically illustrate your
answer.

Year   Value     Year   Value     Year   Value
1900     24      1911     67      1921    131
1901     26      1912     76      1922    136
1902     27      1913     76      1923    140
1903     28      1914     84      1924    142
1904     30      1915     86      1925    145
1905     35      1916    100      1926    148
1906     41      1917    113      1927    150
1907     43      1918    128      1928    158
1908     48      1919    121      1929    162
1909     53      1920    129      1930    170
1910     63


(14) Plot the following figures on a graph paper and study 
their trend. On a separate graph paper show their short-time 
oscillations with the trend removed. 


Year       Value      Year       Value
1913-14     264       1924-25     306
1914-15     265       1925-26     303
1915-16     267       1926-27     306
1916-17     267       1927-28     297
1917-18     269       1928-29     292
1918-19     264       1929-30     304
1919-20     263       1930-31     310
1920-21     265       1931-32     317
1921-22     271       1932-33     331
1922-23     289       1933-34     344
1923-24     310



(15) Following are the total deposits of all exchange banks
in India in crores of rupees. Calculate five-year and nine-year
moving averages and show them graphically.

Year   Deposits     Year   Deposits
1915      34        1925      71
1916      38        1926      72
1917      53        1927      69
1918      62        1928      71
1919      74        1929      67
1920      75        1930      68
1921      75        1931      68
1922      73        1932      73
1923      68        1933      71
1924      71        1934      71
                    1935      76



CHAPTER XVII


CORRELATION 

Black cats cause bad luck while filled-up pitchers bring good
fortune: these are the beliefs held by some people. But these
beliefs are incapable of being justified by mathematical theory.
It is, therefore, difficult to say if there really exists any
relationship between black cats and bad luck and between filled-up
pitchers and good fortune, though occasional coincidences may
suggest such notions. On the other hand, some people believe
that devaluation of the rupee from the 1s. 6d. rate to 1s. 4d.
would stimulate India's export trade, or that a rise in the
rate of interest would encourage savings. These impressions
do indicate some sort of relationship, but they are mere guesses
until they have been tested by the mathematical theory of
drawing conclusions. The theory by means of which quantitative
connections between two sets of phenomena are determined is
called the Theory of Correlation.

Meaning of Correlation.

Correlation means a possible connection, relationship or
interdependence between two sets of phenomena. If in each
of them some factor is numerically measured and it is
discovered that changes in the size of one factor run in sympathy
with changes in the size of the other, or, to say the same thing,
large values of one go with large values of the other and
small with small, or vice versa, the two factors exhibit some
mutual dependence which is termed correlation. In other
words, if two quantities vary in sympathy so that a movement
(an increase or decrease) in the one tends to be accompanied
by a movement in the same or inverse direction in the other,
and the greater the volume of change in the one the greater
is the volume of change in the other, the quantities are said
to be correlated.

In natural sciences correlation can be reduced to absolute
mathematical terms. Heat always increases with light and
an electric current is always associated with a magnetic field.
These instances suggest a high degree of correlation. But in
social sciences it is seldom that any absolutely fixed mathematical
relationship between two variables can be established.
The law of demand, the law of diminishing returns, Gresham's
law, to take a few illustrations, suggest correlation, but this
correlation is not so perfect as that in the natural sciences.
Therefore, in inexact sciences we must take the fact of
correlation as established if, in a large number of cases, two
variables always tend to move in the same or opposite
directions.

Such phenomena are not uncommon in the social and
economic sphere. We very often see that demand for a
commodity generally falls with a rise in its price, that the price level
in a country generally rises with the supply of money, that tall
fathers generally have tall sons, that young husbands
generally have young wives, that a taller man generally tends
to be thinner. In all these cases correlation exists.

Positive and Negative Correlation. 

Correlation may be positive or negative. If the two
given variables steadily deviate in the same direction,
correlation is direct or positive; but if they constantly deviate
in opposite directions, correlation is inverse or negative.
That is, if an increase (or decrease) in the values of one
variable is associated with an increase (or decrease) in the
values of the other, correlation between them is positive. And,
if an increase (or decrease) in the values of one variable is
associated with a decrease (or increase) in the values of the
other, correlation between them is negative. One way of
detecting the positive or negative character of correlation is to
plot the two related variables on graph paper, that is, draw
correlation graphs, and read the direction of the two curves.
If they run parallel throughout (as they do in figure 30),
correlation is direct; but, if they run in opposite directions,
correlation is inverse. If the general level of prices rises with
an increase in the amount of money in circulation, correlation
between money in circulation and prices is positive. If with
an increase in the production of sugar in India the imports
of sugar have gone down, the correlation between production
and imports of sugar is inverse.

Degree of Correlation. 

Correlation exists in various degrees. The radius of a
circle bears a perfectly definite relationship with its area, so
that the area increases in a perfectly definite proportion with
an increase in the radius. Similarly, the area of a square
increases in a definite ratio with an increase in the length of
its side. These are instances where correlation is perfect
and positive. Correlation will be perfectly negative if a
fall of 10 per cent in the price of a commodity results in a 10
per cent rise in its demand. Similarly, there may be instances
where no correlation may exist. If the height of a house is
compared with that of a growing tree over a period of time,
it may be found that while the height of the house remained
unchanged during the period, that of the tree not only
increased but also crossed that of the house. Evidently, the
height of the house cannot be associated with that of the
tree and, therefore, no correlation exists between them.
Correlation may exist in a limited degree. If demand for a
commodity increases, its price also increases, but not necessarily
in the same proportion. This is a case of limited positive
correlation. If the area under food crops in a country increases,
that under non-food crops may fall, but not necessarily in the
same proportion. This is an example of limited negative
correlation.





Thus, correlation is perfect positive if an increase (or
decrease) in one variable is always followed by a corresponding
and proportional increase (or decrease) in the other
related variable. It is perfect negative if an increase (or
decrease) in one factor is followed by a corresponding and
proportional decrease (or increase) in the other factor. There
is no correlation at all if values in one variable cannot be
associated with values in the other variable. In between
perfect positive correlation and no correlation there may be
limited degrees of positive correlation. Similarly, in between
no correlation and perfect negative correlation there may be
limited degrees of negative correlation.

Then, we may construct a scale which begins at the top
with perfectly positive correlation, passes through limited
degrees of positive correlation, reaches and crosses the entire
absence of correlation, and passing through limited degrees
of negative correlation ends at perfectly negative correlation.
Such a scale is provided by the Coefficient of Correlation.

Coefficient of Correlation 

Coefficient of correlation is the numerical measure of
the amount of correlation existing between two variables,
subject and relative. That variable which is used as the standard
is called the subject, and the variable which is compared with
the subject or measured in terms of the subject is called the
relative. Generally, Karl Pearson's coefficient of correlation
is used. This coefficient varies between +1 and -1. When
the coefficient reaches unity, correlation is taken to be perfect.
Perfect positive correlation is indicated by +1, perfect negative
by -1, no correlation or complete independence by 0, and
limited correlation by the intermediate values of the coefficient.
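
The scale from +1 to -1 may be illustrated by a minimal sketch, assuming the usual form of Karl Pearson's coefficient: the sum of the products of the paired deviations from the means, divided by the number of items and by the product of the two standard deviations. The function name `pearson_r` and the figures are hypothetical.

```python
from math import sqrt


def pearson_r(subject, relative):
    """Karl Pearson's coefficient for two equally long lists of values."""
    n = len(subject)
    if n == 0 or n != len(relative):
        raise ValueError("both variables need the same number of items")
    mean_x = sum(subject) / n
    mean_y = sum(relative) / n
    x = [value - mean_x for value in subject]   # deviations of the subject
    y = [value - mean_y for value in relative]  # deviations of the relative
    sum_xy = sum(a * b for a, b in zip(x, y))
    sigma_x = sqrt(sum(a * a for a in x) / n)
    sigma_y = sqrt(sum(b * b for b in y) / n)
    # Fails with a division by zero if either variable never changes.
    return sum_xy / (n * sigma_x * sigma_y)


# Hypothetical figures: price falls by a fixed amount as supply rises.
supply = [10, 12, 14, 16, 18]
price = [50, 45, 40, 35, 30]
print(pearson_r(supply, price))   # close to -1
print(pearson_r(supply, supply))  # close to +1
```

Figures which move together only loosely would give a value between these two limits.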

Study of Correlation.

Correlation may be studied between (1) two related
historical variables and (2) any other two groups of
related phenomena. Correlation may, for instance, be studied
between the output of sugar in India and imports of it over a
period of time to find whether with the increase in output
in the country imports have fallen. It may be studied between
the supply of a commodity and its price over a period of time to
find whether price falls with an increase in supply. These are
examples of historical variables. Correlation may be studied
between the length and breadth of the leaves of a certain
tree to find the relation between their length and breadth. It
may be studied between the stature of fathers and the stature of
sons to find if tall fathers generally have tall sons. These are
all examples of related phenomena. If, however, one produces
figures to show that as the production of cane-sugar increased
in India that of motor cars fell in the U.S.A. over a period of
time, or that as the length of X leaf increased the breadth of Y
leaf decreased, these instances would not imply correlation,
unless there is reason to believe that production of cane sugar
in India and of motor cars in the U.S.A. are related in some
way, or the length of X leaf and breadth of Y leaf are groups
of related phenomena.


Karl Pearson’s Coefficient of Correlation. 

To determine the degree of correlation between two related
variables the coefficient of correlation devised by Karl
Pearson, the great biologist and statistician, is the most
satisfactory. This coefficient is calculated by dividing the sum
of the products of the deviations of each pair of observations
from their respective means by the product of the standard
deviations of the two variables and the number of items. Thus,
if x1, x2, x3, etc., be the deviations of the values of the first
variable, the subject, from the arithmetic average, and y1, y2,
y3, etc., be the corresponding deviations of the values of the
second variable, the relative, and the summation of the products
of x1 with y1, of x2 with y2, of x3 with y3, and so on be
represented by Σxy, and further the standard deviation of the
subject be σ1 and of the relative σ2, and n be equal to the
number of pairs of observations, then r, Karl Pearson’s
coefficient of correlation, will be

$$r = \frac{\sum xy}{n\,\sigma_1\,\sigma_2}$$

When Σxy is positive, correlation will be positive; when
Σxy is negative, correlation is negative. It is the numerator
which largely regulates the size of the coefficient. If positive
items in one series are associated with positive items in
the other series, or if negative items in one series are associated
with negative items in the other series, the coefficient of
correlation is positive. This means that if items larger than the
arithmetic average in the subject are associated with items
larger than the arithmetic average in the relative, or items
smaller than the mean in the subject are associated with items
smaller than the mean in the relative, the correlation coefficient
will be positive. If, however, positive items in one series are
associated with negative items in the other or vice versa, the
correlation coefficient will be negative. When positive and
negative deviations in the two series are indifferently associated,
correlation will tend to zero, and will reach that limit
when the negative products of deviations will be equal to the
positive products of deviations, i.e. when Σxy will be zero.
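The behaviour of the sign described above can be tried out with a short Python sketch. It is only an illustration and not part of the original text: the function name `pearson_r` and the three small series are assumptions chosen to show a positive, a negative and a zero coefficient.

```python
from math import sqrt

def pearson_r(xs, ys):
    """r = sum(xy) / (n * sigma1 * sigma2), x and y being deviations from the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    x = [v - mean_x for v in xs]          # deviations of the subject
    y = [v - mean_y for v in ys]          # deviations of the relative
    sum_xy = sum(a * b for a, b in zip(x, y))
    sigma1 = sqrt(sum(a * a for a in x) / n)
    sigma2 = sqrt(sum(b * b for b in y) / n)
    return sum_xy / (n * sigma1 * sigma2)

# Hypothetical series, invented for illustration only.
same_direction = pearson_r([1, 2, 3, 4], [10, 20, 30, 40])  # like signs paired
opposite_direction = pearson_r([1, 2, 3, 4], [40, 30, 20, 10])  # unlike signs paired
indifferent = pearson_r([1, 2, 3, 4], [5, 9, 9, 5])  # products cancel, sum(xy) = 0
```

In the third series the positive and negative products of deviations are equal, so Σxy, and with it r, is zero.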

Calculation of Pearsonian Coefficient of Correlation.

Direct Method 

Example 1. Required to calculate coefficient of correlation
between ages of husband and wife in a given community
at a certain time.



Table 53. Calculation of Pearsonian Coefficient of Correlation
between ages of husband and wife.

            Subject (X)                     Relative (Y)            Product of
  Age of    Deviation   Square of   Age of    Deviation   Square of   deviations of
  husband   from        deviation   wife      from        deviation   husband's age
  (years)   average                 (years)   average                 and of wife's
            (25 yrs.)                         (18 yrs.)               age
  m1        x           x²          m2        y           y²          xy

  19        -6          36          14        -4          16          +24
  21        -4          16          16        -2           4          + 8
  22        -3           9          15        -3           9          + 9
  23        -2           4          14        -4          16          + 8
  23        -2           4          17        -1           1          + 2
  24        -1           1          14        -4          16          + 4
  24        -1           1          17        -1           1          + 1
  25         0           0          18         0           0            0
  26        +1           1          17        -1           1          - 1
  26        +1           1          20        +2           4          + 2
  27        +2           4          21        +3           9          + 6
  28        +3           9          20        +2           4          + 6
  28        +3           9          22        +4          16          +12
  29        +4          16          22        +4          16          +16
  30        +5          25          23        +5          25          +25

  Σm1=375               Σx²=136     Σm2=270               Σy²=138     Σxy=+122
  a1=25                             a2=18

n, or number of pairs of observations = 15


Standard deviation is determined by the formula*

$$\sigma = \sqrt{\frac{\sum d^2}{n}}$$

In table 53, the d's in the X series are called x's;
those in the Y series, y's. Accordingly, the formula for the X
series is $\sqrt{\sum x^2 / n}$; for the Y series, $\sqrt{\sum y^2 / n}$.

$$\therefore\ \sigma_1 = \sqrt{\frac{136}{15}} = 3.01 \text{ years}$$

and

$$\sigma_2 = \sqrt{\frac{138}{15}} = 3.03 \text{ years.}$$

$$r = \frac{\sum xy}{n\,\sigma_1\,\sigma_2} = \frac{+122}{15 \times 3.01 \times 3.03} = +.89$$

*See Table 23, Chapter XI for computing standard deviation.

+.89 indicates a very high degree of positive correlation,
implying that the age of wife increases with that of husband.
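The working of Example 1 can be retraced with a few lines of Python. This is a sketch for checking the arithmetic, not the authors' procedure; the variable names are assumptions, and the ages are those listed in table 53.

```python
from math import sqrt

# Ages of husbands (subject) and wives (relative) from table 53.
husband = [19, 21, 22, 23, 23, 24, 24, 25, 26, 26, 27, 28, 28, 29, 30]
wife = [14, 16, 15, 14, 17, 14, 17, 18, 17, 20, 21, 20, 22, 22, 23]

n = len(husband)                      # number of pairs of observations
a1 = sum(husband) / n                 # average age of husbands
a2 = sum(wife) / n                    # average age of wives

x = [m - a1 for m in husband]         # deviations from the average, X series
y = [m - a2 for m in wife]            # deviations from the average, Y series

sum_x2 = sum(d * d for d in x)
sum_y2 = sum(d * d for d in y)
sum_xy = sum(p * q for p, q in zip(x, y))

sigma1 = sqrt(sum_x2 / n)             # standard deviation of the X series
sigma2 = sqrt(sum_y2 / n)             # standard deviation of the Y series
r = sum_xy / (n * sigma1 * sigma2)    # Karl Pearson's coefficient
```

Run as written, the sums come to Σx² = 136, Σy² = 138 and Σxy = 122, and r rounds to +.89.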

Short-cut Method 

In the above example the averages of the ages of husbands
and wives happen to be whole numbers. Therefore, the calculation
of the deviations of ages from the mean, their squaring
up, and their multiplication did not involve any trouble. If,
however, the averages contain a fraction, these calculations
would involve much labour, to do away with which the short-cut
method may be used. In using it, any whole number may
be assumed as the average, deviations from it calculated and
squared, and the standard deviations computed according to
the short-cut method of computing the standard deviation.
The deviations of the two series may be multiplied and
summated. The resulting Σxy/n should later be corrected
by subtracting from it the product of the differences between
the true means and the assumed means of the two series.
Thus, if p be the true average of the products, i.e. the true or
corrected value of Σxy/n, then

$$p = \frac{\sum xy}{n} - (a_1 - x_1)(a_2 - x_2)$$

where a1 stands for the true average and x1 for the assumed
average of the first series, and a2 stands for the true average
and x2 for the assumed average of the second series, and Σxy
is the summation of the products of deviations from assumed
means.

Then, the coefficient of correlation,

$$r = \frac{p}{\sigma_1\,\sigma_2}$$

The above two processes may also be combined into one
formula, so that without changing what the symbols stand for,

$$r = \frac{\sum xy - n\,(a_1 - x_1)(a_2 - x_2)}{n\,\sigma_1\,\sigma_2}$$
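Why the correction term takes this form can be seen from a short derivation. It is offered as a sketch, not as the authors' own working; the symbols X_i and Y_i (the observed values) and d_i and e_i (their deviations from the assumed averages) are introduced here only for the purpose of the derivation.

```latex
\begin{align*}
d_i &= X_i - x_1, \qquad e_i = Y_i - x_2 \\
\sum d_i &= n\,(a_1 - x_1), \qquad \sum e_i = n\,(a_2 - x_2) \\
\sum (X_i - a_1)(Y_i - a_2)
  &= \sum \bigl[d_i - (a_1 - x_1)\bigr]\bigl[e_i - (a_2 - x_2)\bigr] \\
  &= \sum d_i e_i - (a_2 - x_2)\sum d_i - (a_1 - x_1)\sum e_i
     + n\,(a_1 - x_1)(a_2 - x_2) \\
  &= \sum d_i e_i - n\,(a_1 - x_1)(a_2 - x_2)
\end{align*}
```

The second line follows because the observed values of each series sum to n times their true average. Dividing the last line by n gives p, the true average of the products.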





Example 2. Required to calculate the coefficient of
correlation between birth-rate and death-rate of a few
countries of the world during 1931, using the short-cut method.


Table 54. Calculating the Pearsonian Coefficient of Correlation
between birth rate and death rate for a few
countries of the world for 1931.

  Country       Birth   Deviation   Square of   Death   Deviation   Square of   Product of
                rate    from        deviation   rate    from        deviation   deviations
                        assumed                         assumed
                        average                         average
                        (26)                            (15)
                m1      x           x²          m2      y           y²          xy

  Egypt         44      +18         324         27      +12         144         +216
  Canada        24      - 2           4         11      - 4          16         +  8
  U. S. A.      19      - 7          49         12      - 3           9         + 21
  India         33      + 7          49         24      + 9          81         + 63
  Japan         32      + 6          36         19      + 4          16         + 24
  Germany       16      -10         100         11      - 4          16         + 40
  France        18      - 8          64         16      + 1           1         -  8
  I. F. State   20      - 6          36         14      - 1           1         +  6
  U. K.         16      -10         100         12      - 3           9         + 30
  U. S. S. R.   40      +14         196         18      + 3           9         + 42
  Australia     20      - 6          36          9      - 6          36         + 36
  New Zealand   18      - 8          64          8      - 7          49         + 56
  Palestine     53      +27         729         23      + 8          64         +216
  Sweden        15      -11         121         12      - 3           9         + 33
  Norway        17      - 9          81         11      - 4          16         + 36

  n=15          Σm1=385             Σx²=1989    Σm2=227             Σy²=476     Σxy=+819
                a1=25.67                        a2=15.13

n, or number of pairs of observations = 15.

a1, or True arithmetic average, for the first series

$$= \frac{\sum m_1}{n} = \frac{385}{15} = 25.67$$

Let x1, or Assumed average, for the first series = 26.

a2, or True arithmetic average, for the second series

$$= \frac{\sum m_2}{n} = \frac{227}{15} = 15.13$$

Let x2, or Assumed average, for the second series = 15.

Standard deviation, using the short-cut method, is determined
by the formula*

$$\sigma = \sqrt{\frac{\sum d^2 - n\,(a - x)^2}{n}}$$

In table 54, the d's in the first series are called x's; those
in the second series, y's. Accordingly, the formula for the
first series is $\sqrt{\{\sum x^2 - n(a_1 - x_1)^2\}/n}$, and for the
second series, $\sqrt{\{\sum y^2 - n(a_2 - x_2)^2\}/n}$.

$$\therefore\ \sigma_1 = \sqrt{\frac{1989 - 15\,(25.67 - 26)^2}{15}} = 11.51$$

and

$$\sigma_2 = \sqrt{\frac{476 - 15\,(15.13 - 15)^2}{15}} = 5.632$$

$$r = \frac{\sum xy - n\,[(a_1 - x_1)(a_2 - x_2)]}{n\,\sigma_1\,\sigma_2}
   = \frac{+819 - 15\,(-.33 \times .13)}{15 \times 11.51 \times 5.632} = +.843$$

+.843 denotes a very high degree of positive correlation
between birth-rate and death-rate of the given countries of
the world.
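The short-cut working of Example 2 can be checked with the following Python sketch. It is an illustration under stated assumptions rather than the authors' own computation: the variable names are invented, and the birth rate of 19 for the U.S.A. is inferred from its deviation of -7 from the assumed average of 26 and from the column total of 385.

```python
from math import sqrt

# Birth and death rates from table 54 (U.S.A. birth rate inferred, see above).
birth = [44, 24, 19, 33, 32, 16, 18, 20, 16, 40, 20, 18, 53, 15, 17]
death = [27, 11, 12, 24, 19, 11, 16, 14, 12, 18, 9, 8, 23, 12, 11]
x1, x2 = 26, 15                       # assumed averages used in the text

n = len(birth)
a1 = sum(birth) / n                   # true average, first series
a2 = sum(death) / n                   # true average, second series

x = [b - x1 for b in birth]           # deviations from the assumed average
y = [d - x2 for d in death]

sum_x2 = sum(v * v for v in x)
sum_y2 = sum(v * v for v in y)
sum_xy = sum(p * q for p, q in zip(x, y))

# Short-cut standard deviations: correct for the shift of the origin.
sigma1 = sqrt((sum_x2 - n * (a1 - x1) ** 2) / n)
sigma2 = sqrt((sum_y2 - n * (a2 - x2) ** 2) / n)

# Combined formula: corrected sum of products over n * sigma1 * sigma2.
r = (sum_xy - n * (a1 - x1) * (a2 - x2)) / (n * sigma1 * sigma2)
```

Because the code carries the averages to full precision instead of two decimal places, its intermediate figures may differ slightly in the last digit from a hand computation.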


*See Table 25, Chapter XI, for the Short-cut Method.

Co-efficient of Correlation for Long-Time Changes.

In the above two examples the variations in the items relate
to a specific time. Correlation may also be studied for historical
data, that is, data stretched over a period of time.
Historical data may relate to (i) Long-time changes and (ii)
Short-time oscillations. In computing the co-efficient of
correlation for long-time changes, the method used in example 1,
or, if need be, that used in example 2, shall be followed
throughout, the items and deviations from the mean for the same
date being paired together. In computing the co-efficient of
correlation for short-time oscillations this method will be modified.

Pearson’s Modified Co-efficient for use with Short-time
Oscillations.

It is possible that the short-time changes in two variables
may be in opposite directions while the long-term changes may
be in the same direction. Then, if co-efficient of correlation
of such variables is computed by the method used in the foregoing
two examples, a large positive co-efficient would result
which would not take any account of the opposite direction
of the short-time oscillations. Correlation co-efficient computed
from actual items would consequently be misleading. We
should, therefore, be concerned with short-time oscillations only
and rid our data of the long-time variations.* To do it we
should discover the trend and eliminate it by computing the
deviations of original items from the trend. These deviations
should be multiplied together to yield Σxy. And these deviations,
again, should be squared up to compute standard deviations.
Thus, the modification made in the original formula is
that deviations of the items are taken from the trend instead
of from the arithmetic average. Example 3 demonstrates the
working of this method.

Example 3. Required to compute the co-efficient of correlation
of the short-time oscillations for indices of supply and
price of a certain commodity.

*See Chapter XVI for ‘Elimination of long time variations.’





Table 55. Computing Co-efficient of Correlation of Short-time
Oscillations between Supply and Price.

[Five-yearly cycle has been assumed in the above series
and decimals have been ignored in computing the moving
average. Greater precision could be achieved by carrying out
the decimals.]

n, or number of pairs of observations = 11, since only the
years 1922 to 1932 can be used in computing the co-efficient.

$$\sigma_1 = \sqrt{\frac{\sum x^2}{n}} = \sqrt{\frac{154}{11}} = \sqrt{14} = 3.742$$

$$\sigma_2 = \sqrt{\frac{\sum y^2}{n}} = \sqrt{\frac{601}{11}} = \sqrt{54.6} = 7.389$$

$$r = \frac{\sum xy}{n\,\sigma_1\,\sigma_2} = \frac{-285}{11 \times 3.742 \times 7.389} = -.937$$

-.937 denotes a very high degree of inverse correlation
between supply and price, indicating that as supply increases
price falls and vice versa.
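The elimination of the trend in Example 3 may be sketched in Python as follows. The index numbers are those printed for 1920 to 1934 in table 61, which the text says uses the data of table 55. Two points are assumptions rather than statements of the text: the five-yearly moving average is centred on the middle year, and "ignoring decimals" is taken to mean rounding to the nearest whole number. The function name `trend_deviations` is likewise invented for the illustration.

```python
from math import sqrt

# Annual index numbers, 1920 to 1934, as printed in table 61.
supply = [91, 98, 95, 92, 93, 96, 102, 107, 104, 98, 100, 108, 116, 114, 111]
price = [117, 97, 102, 108, 105, 96, 77, 68, 77, 93, 89, 83, 78, 84, 93]

def trend_deviations(series, span=5):
    """Deviation of each item from its centred moving average (rounded)."""
    half = span // 2
    deviations = []
    for i in range(half, len(series) - half):
        window = series[i - half:i + half + 1]
        trend = round(sum(window) / span)   # assumed: nearest whole number
        deviations.append(series[i] - trend)
    return deviations

x = trend_deviations(supply)          # short-time oscillations of supply
y = trend_deviations(price)           # short-time oscillations of price
n = len(x)                            # only 1922 to 1932 have a trend value

sum_x2 = sum(v * v for v in x)
sum_y2 = sum(v * v for v in y)
sum_xy = sum(p * q for p, q in zip(x, y))

sigma1 = sqrt(sum_x2 / n)
sigma2 = sqrt(sum_y2 / n)
r = sum_xy / (n * sigma1 * sigma2)
```

Under these assumptions the sketch gives n = 11 and Σxy = -285, and r rounds to -.937.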


Calculation of Correlation Co-efficient in Grouped Series.

In the foregoing three examples the given series relate to
quantitative individual observations. Correlation of grouped
series can also be similarly studied. We may measure an adequate
number of pairs of values for each member and find
what values are associated together and how often the same
values are repeated. When this is done, we can group our
data into a table of double entry, or contingency table. Suppose
we find that in two class-tests, one in Economics and
the other in Geography, at which 60 boys were examined the
following were the results:

Table 56. Frequency distribution of marks in Economics & 
Geography. 







If we desire to study the relationship between the knowledge
of Economics and that of Geography with the help of
the above two series, we would need some more information:
we should know what values of the two series are associated
together and how frequently the same values are repeated.
Suppose we find that one boy who got marks varying between
5-15 in Economics also got marks varying between
0-10 in Geography, that three boys who got marks varying
between 5-15 in Economics also got marks varying between
10-20 in Geography, and so on, we can prepare a table of
double entry as follows:


Table 57. Correlation Table for Marks in Economics and
Geography.

  Y                      X
  Marks in Geography     Marks in Economics (max. marks 50)     Total
  (max. marks 50)        5-15    15-25    25-35    35-45        fy

  0-10                     1       1        -        -            2
  10-20                    3       6        5        1           15
  20-30                    1       8        9        2           20
  30-40                    -       3        9        3           15
  40-50                    -       -        4        4            8

  Total                    5      18       27       10           60


Table 57 shows the grouped frequency distribution of two
variables. This distribution may be termed a Bivariate Frequency
Distribution, and the table a Contingency Table. But
if we are particularly interested in the relationship between
the two variables this table of double entry may be designated
a Correlation Table.

Example 4. Required to compute correlation co-efficient
from the data given in table 57.

[To compute the co-efficient the formula used in example 1,
where deviations were calculated from the true mean, or the
formula used in example 2, where deviations were calculated
from the assumed mean, may be used in this example too.
The latter procedure saves much labour and, therefore, it will
be adopted in the given case.]

In tables 58 and 59 we calculate the standard deviations
of the X and the Y series, relating to marks in Economics
and Geography, respectively. Let the assumed averages, x1
and x2, for the X and the Y series be respectively 30 and 25.


Table 58. Calculation of Standard Deviation of X series.

  Marks    Mid-value   Frequency   Product of    Deviation      Square of   Product of
  group                            mid-value &   from assumed   deviation   frequency &
                                   frequency     average (30)               square of
                                                                            deviation
           m           f           mf            dx             d²x         fd²x
  (1)      (2)         (3)         (4)           (5)            (6)         (7)

  5-15     10           5            50          -20            400         2000
  15-25    20          18           360          -10            100         1800
  25-35    30          27           810            0              0            0
  35-45    40          10           400          +10            100         1000

                       n=60        Σmf=1620                                 Σfd²x=4800




x1, or Assumed average = 30 marks.

a1, or True average $= \frac{\sum mf}{n} = \frac{1620}{60} = 27$ marks.

$$\sigma_1 = \sqrt{\frac{\sum f d_x^2 - n\,(a_1 - x_1)^2}{n}}
          = \sqrt{\frac{4800 - 60\,(27 - 30)^2}{60}}
          = \sqrt{71} = 8.426 \text{ marks.}$$


Table 59. Calculation of Standard Deviation of Y series.

  Marks    Mid-value   Frequency   Product of    Deviation      Square of   Product of
  group                            mid-value &   from assumed   deviation   frequency &
                                   frequency     mean (25)                  square of
                                                                            deviation
           m           f           mf            dy             d²y         fd²y
  (1)      (2)         (3)         (4)           (5)            (6)         (7)

  0-10      5           2            10          -20            400          800
  10-20    15          15           225          -10            100         1500
  20-30    25          20           500            0              0            0
  30-40    35          15           525          +10            100         1500
  40-50    45           8           360          +20            400         3200

                       n=60        Σmf=1620                                 Σfd²y=7000


x2, or Assumed average = 25 marks.

a2, or True average $= \frac{\sum mf}{n} = \frac{1620}{60} = 27$ marks.

$$\sigma_2 = \sqrt{\frac{\sum f d_y^2 - n\,(a_2 - x_2)^2}{n}}
          = \sqrt{\frac{7000 - 60\,(27 - 25)^2}{60}}
          = \sqrt{112.67} = 10.614 \text{ marks.}$$

Now, the value of Σxy remains to be determined.
Table 60 shows the method of determining it. dx, deviations
from the assumed mean in X series shown in column (5), table
58, and dy, deviations from the assumed mean in Y series
shown in column (5), table 59, are taken to table 60. Related
pairs of deviations are first multiplied and the product put
down in the left-hand corner of their respective squares.
Thus, -20, dx, is multiplied with -20, dy, and the product
placed in the left corner of the square formed by the 1st row
and the 1st column; -20, dx, and -10, dy, are multiplied and
the product placed in the left corner of the square formed by
the 2nd row and the 1st column; -10, dx, and +10, dy, are
multiplied and the product, -100, placed in the left corner
of the square formed by 4th row and 2nd column; and so on.

These products of dx and dy are multiplied by their
respective frequencies placed in the centre of their respective
squares. The final products are placed in the right corner
of their respective squares. These final products, when
summated, give the Σxy. This summation refers to the assumed
averages and will, therefore, as in example 2, be corrected
by subtracting from it n times the product of the differences
between true and assumed means of the two series.



Table 60. Calculation of Summation of Products of Deviations
(Σxy).

[Each square shows: product of dx and dy × frequency = final product.]

  Column No.                 1             2              3            4
  Row   Marks      dy        5-15          15-25          25-35        35-45          Product of
  No.   in Y                 dx = -20      dx = -10       dx = 0       dx = +10       dx and dy and
                                                                                      frequency

  1     0-10       -20       400×1=400     200×1=200      -            -               600
  2     10-20      -10       200×3=600     100×6=600      0×5=0        -100×1=-100    1100
  3     20-30        0       0×1=0         0×8=0          0×9=0        0×2=0             0
  4     30-40      +10       -             -100×3=-300    0×9=0        100×3=300         0
  5     40-50      +20       -             -              0×4=0        200×4=800       800

  Product of dx and dy
  and frequency (fxy)        1000          500            0            1000           Σxy=2500












$$r = \frac{\sum xy - n\,[(a_1 - x_1)(a_2 - x_2)]}{n\,\sigma_1\,\sigma_2}
   = \frac{2500 - 60\,[(27 - 30)(27 - 25)]}{60 \times 10.614 \times 8.426}
   = \frac{2860}{5366.0} = +.53$$

+.53 indicates a moderately high degree of positive
correlation between marks in Economics and Geography.
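The grouped calculation of Example 4 can be reproduced directly from the cell frequencies of table 57. The Python sketch below is an illustration only; the layout of the `freq` grid (rows for Geography groups, columns for Economics groups) and all variable names are assumptions made for the example.

```python
from math import sqrt

x_mid = [10, 20, 30, 40]              # mid-values of Economics groups (X)
y_mid = [5, 15, 25, 35, 45]           # mid-values of Geography groups (Y)
freq = [                              # rows: Y groups, columns: X groups
    [1, 1, 0, 0],
    [3, 6, 5, 1],
    [1, 8, 9, 2],
    [0, 3, 9, 3],
    [0, 0, 4, 4],
]
x1, x2 = 30, 25                       # assumed averages used in the text

n = sum(sum(row) for row in freq)
fx = [sum(row[j] for row in freq) for j in range(len(x_mid))]  # column totals
fy = [sum(row) for row in freq]                                # row totals

a1 = sum(f * m for f, m in zip(fx, x_mid)) / n   # true average of X
a2 = sum(f * m for f, m in zip(fy, y_mid)) / n   # true average of Y

sum_fdx2 = sum(f * (m - x1) ** 2 for f, m in zip(fx, x_mid))
sum_fdy2 = sum(f * (m - x2) ** 2 for f, m in zip(fy, y_mid))

# Each cell contributes frequency * dx * dy, as in table 60.
sum_xy = sum(
    freq[i][j] * (x_mid[j] - x1) * (y_mid[i] - x2)
    for i in range(len(y_mid))
    for j in range(len(x_mid))
)

sigma1 = sqrt((sum_fdx2 - n * (a1 - x1) ** 2) / n)
sigma2 = sqrt((sum_fdy2 - n * (a2 - x2) ** 2) / n)
r = (sum_xy - n * (a1 - x1) * (a2 - x2)) / (n * sigma1 * sigma2)
```

Like the hand working, the sketch treats every observation in a group as if it stood at the mid-value of that group.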

In the above example, it was assumed that the values of
the various frequencies in the X and the Y series were equal
to the mid-values of their class-intervals. Accordingly, the
deviations, standard deviations and products of deviations
had reference to the mid-values of marks-groups. No doubt,
a particular class-interval includes all values between its class
limits, but the assumption we have made does not generally
create a large difference in the result, and is usually adopted.

In examples 1, 2 and 4 correlation is positive but not per- 
fect. A simple example of perfect positive correlation is the 
following : — 

Number of persons : 1, 2, 3, 4, 5, 6, 7, 8. 

Number of eyes : 2, 4, 6, 8, 10, 12, 14, 16. 

Correlation co-efficient in the above series will be +1. 

Assumptions of Pearsonian Correlation.

Karl Pearson’s co-efficient of correlation is based on two 
assumptions : 

(1) In each of the series correlated a large variety of independent
causes are operating so as to produce normal distribution.
Such causes, for example, are variations in climate,
nourishment, physical training, environment. The series
resulting from the effect of such independent contributory
causation would show normal distribution. Such causes were,
for instance, operating in the determination of ages of husbands
and wives in example 1.

(2) The forces so operating are related in a causal way.
If the forces are independent of each other, there would be no
correlation. If the height of a house remained unaltered
while that of a growing child increased, there would be no
correlation between them, since the causes affecting one variable
would not be found to affect the other, that is, the sizes
in one could not be said to be associated with the sizes in the
other.

Characteristics of Pearsonian Co-efficient.

Karl Pearson’s co-efficient of correlation is zero when
independence between two variables is complete and is unity
when there is perfect correlation, i.e., when the connection
between variables is rigid. It always varies between +1 and
-1 and is a sensitive measure of the amount of correlation.
It is based on all the observations of the given variables and is
independent of the units in which the variables are measured.

Probable Error of the Co-efficient.

Probable error is a measure which when added to or subtracted
from a most probable measurement gives the limits
within which it is probable that an item of the same nature,
if selected at random, will fall.

Co-efficient of correlation also has a probable error. It
is that amount which when added to and subtracted from the
average correlation co-efficient gives amounts within which
the probability is that a co-efficient of correlation from series
selected at random from the same universe will fall.

The formula for the probable error of Karl Pearson’s
Co-efficient of Correlation is

$$\text{P.E.} = .6745\,\frac{1 - r^2}{\sqrt{n}}$$

where r is the co-efficient of correlation and n the number of
items paired.



The probable error of the co-efficient of correlation, +.89,
between the ages of husband and wife computed in example 1,
will be

$$.6745 \times \frac{1 - (.89)^2}{\sqrt{15}} = .036.$$


The Co-efficient of Correlation for the example under
consideration should, therefore, be written as

r = +.89 ± .036.

It may be asked whether the positive correlation between
ages of husband and wife is “significant.” Probable error
supplies the answer to this query. If in a given case, (1) r is
less than the probable error, there is no evidence of correlation;
but if (2) r is six times the probable error, correlation is
significant; that is, its existence is a practical certainty. In
example 1, r is nearly 25 times the probable error. Correlation is,
therefore, significant. We can now say that the co-efficient
of correlation in example 1 actually lies between .926,
(.89 + .036), and .854, (.89 - .036), and that another co-efficient
computed from series chosen at random from the universe
from which the given series was selected would fall within
this range.
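The probable error and the two rules of interpretation can be put into a small Python sketch. The function names and the label "doubtful" for the region between the two rules are assumptions made for illustration; the text itself states only the two limiting cases.

```python
from math import sqrt

def probable_error(r, n):
    """Probable error of Pearson's r: .6745 * (1 - r^2) / sqrt(n)."""
    return 0.6745 * (1 - r ** 2) / sqrt(n)

def interpret(r, n):
    """Apply the two rules of the text; the middle label is an assumption."""
    pe = probable_error(r, n)
    if abs(r) < pe:
        return "no evidence of correlation"
    if abs(r) >= 6 * pe:
        return "significant"
    return "doubtful"

# Example 1: r = +.89 from n = 15 pairs of ages.
r, n = 0.89, 15
pe = probable_error(r, n)
ratio = r / pe                        # how many times r exceeds its probable error
lower, upper = r - pe, r + pe         # limits r - P.E. and r + P.E.
verdict = interpret(r, n)
```

For Example 1 the sketch gives a probable error of about .036, so that r is between 24 and 25 times its probable error.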

To the two generally accepted rules for the interpretation
of co-efficient of correlation noted above, there might be added
the further statements that, in those cases in which the probable
error is relatively small,

(1) the correlation should not be considered at all
marked if r is less than 0.30, and

(2) the correlation is decidedly existing if r is above
0.50.

It may be noted that the probable error at times leads to
wrong results unless r is small and n is large. In order that
the formula for co-efficient of correlation may yield satisfactory
result, n should be fairly large.





Interpretation of Correlation. 

The above four rules must be kept in view while interpreting
the correlation co-efficient. When a correlation co-efficient,
it may be added, is found to be significant, it should
not be taken to mean more than what it does. For instance,
in example 1, correlation between ages of husbands and wives
is strong positive. It simply shows a connection between the
two age series and does not necessarily mean that every young
husband has a young wife. Correlation is true on the average.
A particular old man may have a young wife or two wives,
one young and the other old.

Again, if supply and price in example 3 are negatively
correlated, it does not mean that increase in supply is the only
cause of fall in price. There may be several other causes too
leading to this particular ‘effect’. Similarly, if marks in
Geography and Economics are positively correlated it does not
imply that the two subjects are necessarily related as cause and
effect. Knowledge of one subject may be helpful in the other,
but the correlation may also be due to some third factor, e.g.
adequate teaching in both the subjects. So a direct cause and
effect relationship is not always and in all cases established
by the fact that two series are correlated.

Co-efficient of Concurrent Deviations.

So far we were concerned with only one method of
measuring correlation which may be termed the Sum
Product method, since the measure is dependent on the sum
of the products of the deviations. If, however, a measure of
association in the direction of change alone is desired, the
method of concurrent deviations may be used.

In example 3, we have used the modified method of
measuring co-efficient of correlation for short-time oscillations.
Co-efficient of concurrent deviations provides a much simpler
method for the same study and gives satisfactory results in
most cases. This method, however, is not suitable for dealing
with long-time changes, since it does not take account of the
general trend.

If, in comparing two historical variables relating to short-time
oscillations, it is found that the two curves move in the
same direction at the same time, that is, if the deviations are
concurrent, there is a marked evidence of direct or positive
correlation between the short-time fluctuations. But, if the
curves are steadily moving in opposite directions, that is,
if the deviations are divergent, there is an evidence of inverse
or negative correlation. To compute the co-efficient of correlation
in such cases, we take into consideration not the deviations
from the arithmetic means nor those from the moving averages
but simply those from the measurement of the preceding date
recorded. Secondly, we take into consideration not the size of
the deviation but only its direction. The following empirical
formula is used for the purpose of computing the co-efficient.
This formula has the same characteristics as that of Karl
Pearson, viz., +1 denotes perfect positive correlation, -1 indicates
perfect negative correlation and 0 shows absence of
correlation.

If r = the co-efficient of correlation,
n = the number of pairs of observations,
c = the number of concurrent deviations, then

$$r = \pm\sqrt{\pm\frac{2c - n}{n}}$$

The use of signs should be carefully understood. If the
quantity (2c - n)/n is negative, a minus sign is placed before it
and also before the radical so that the square root can be taken
and the resulting co-efficient may retain the same sign as that
of the original quantity.


Example 5. Required to calculate the co-efficient of 
correlation from the data given in table 55 by the method of 
concurrent deviations. 





Table 61. Computation of Correlation of Short-time Fluctuations
of Supply and Price by means of Concurrent
Deviations.

             Supply                    Price                  Product
  Year    Index of   Deviation      Index of   Deviation
          Supply     from           Price      from
                     preceding                 preceding
                     year                      year
                     x                         y              xy

  1920     91                       117
  1921     98        +               97        -              -
  1922     95        -              102        +              -
  1923     92        -              108        +              -
  1924     93        +              105        -              -
  1925     96        +               96        -              -
  1926    102        +               77        -              -
  1927    107        +               68        -              -
  1928    104        -               77        +              -
  1929     98        -               93        +              -
  1930    100        +               89        -              -
  1931    108        +               83        -              -
  1932    116        +               78        -              -
  1933    114        -               84        +              -
  1934    111        -               93        +              -

n = 14, since only the years 1921 to 1934 can be used in
computing the co-efficient.

c = 0, since there are no pairs having like signs.

$$r = -\sqrt{-\frac{2 \times 0 - 14}{14}} = -\sqrt{1} = -1$$

Therefore, there is perfect inverse correlation.
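The method of concurrent deviations lends itself to a short Python sketch. The function name is invented for the illustration, and one detail is an assumption not settled by the text: a year in which either series shows no change from the preceding year is counted as not concurrent.

```python
from math import sqrt

def concurrent_deviation_coefficient(xs, ys):
    """r = +/- sqrt(+/- (2c - n) / n), the sign following that of (2c - n) / n."""
    dx = [b - a for a, b in zip(xs, xs[1:])]   # change from the preceding date
    dy = [b - a for a, b in zip(ys, ys[1:])]
    n = len(dx)                                # number of pairs of deviations
    # Concurrent: both series move in the same direction (assumed: ties excluded).
    c = sum(1 for p, q in zip(dx, dy) if p * q > 0)
    quantity = (2 * c - n) / n
    return sqrt(quantity) if quantity >= 0 else -sqrt(-quantity)

# Index numbers of supply and price, 1920 to 1934, from table 61.
supply = [91, 98, 95, 92, 93, 96, 102, 107, 104, 98, 100, 108, 116, 114, 111]
price = [117, 97, 102, 108, 105, 96, 77, 68, 77, 93, 89, 83, 78, 84, 93]

r = concurrent_deviation_coefficient(supply, price)
```

Since supply and price never move in the same direction in these data, c is 0 and the function returns -1, the perfect inverse correlation found in the example.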




Correlation by Graphic Method

While discussing positive and negative correlation it was 
pointed out that one way of detecting the negative or positive 
character of correlation is to draw correlation graphs and 
read the direction of the curves. This method is illustrated 
in figure 30 in which monthly figures relating to volume and 
value of exports of rice by sea from India given in table 62 
are plotted. 


Table 62.* Foreign Sea-borne Trade: Exports (Value and
Volume) of Rice (not in Husk) in 1941-42.

  Month          Volume      Value
                 Tons        Rs.
                 (000)       (00,000)

  April            22          32
  May              29          45
  June             22          32
  July             19          29
  August           27          44
  September        43          69
  October          24          40
  November         18          29
  December         20          31
  January          23          37
  February         32          53
  March            26          43

  Average          25.4        40.3


In drawing correlation graphs the choice of scales and
base line should be so made that if lines representing averages
of the two series are drawn parallel to the base they would
be as close to each other as possible. There is no objection to
taking a false base line if it is required for bringing the two
average lines nearer each other. By drawing the curves on
such base line and scale their fluctuations would be thrown into
proper relief.

* Compiled from the Monthly Survey of Business Conditions in India,
August 1942.


Fig. 30. Curves representing volume and value of rice exported
from India.


In figure 30 the average lines are close to each other.
They might have been brought still closer or made to overlap
each other. But, thereby the two series, their nature being
such, might also have overlapped each other, at least for a large
part, and their graphic display would have been spoilt. The
curves, as they are drawn, serve their purpose all right. They
run remarkably parallel to each other throughout their upward
and downward journey. They, therefore, indicate positive
correlation between volume and value of exports of rice
during the several months of 1941-42. Similarly, with another
correlation graph we might have found that both the curves
steadily ran in opposite directions, yielding negative correlation
between the series.

But correlation graphs are not capable of doing anything
more than suggesting the fact of a possible relationship
between two variables. We can certainly note from them
whether fluctuations agree throughout their courses, whether
both of them rise and fall together, whether maxima and
minima occur at the same dates, and so on; but we can
neither establish any causal relationship between the two
variables nor obtain the exact degree of correlation through
them. They only tell us whether the two variables are positively
or negatively correlated.

It may be observed that in investigating causal relations 
ratios help more than quantities. If two variables are really 
related to each other, the proportional increase or decrease in 
one may vary directly with the proportional increase or 
decrease in the other. Consequently, resemblance between 
two curves may be brought out distinctly if the logarithmic 
scale is substituted for the natural scale. The same can also 
be done by reducing the two variables to index numbers and 
plotting them on a graph with common base line and common 
vertical scale. Preference may be given to logarithmic scale 
over the natural scale for the additional reason that wrong 
and deceptive conclusions might be drawn if scales are shrewd- 
ly manipulated and base lines not inserted correctly while 
using the natural scale. 

If two given series show fluctuations with time and we
are interested in the correlation for long-time and for short-time
changes, a different method of comparison will be
followed.


Graphic Correlation of Time Changes.

In example 3 we have used a modification of Pearson's
method for computing the correlation co-efficient. The reason
given for the modification was that if we desire to discover





the relationship of short-time oscillations we should rid our
data of the long-time variations. If the annual index numbers
of supply and price in table 55 are plotted on a graph and
their correlation is studied, it would be found that it is negative,
the two curves moving in opposite directions. If long-period
changes are compared by plotting the 5-yearly averages
given in columns 3 and 7 it will be seen that as supply shows
a rising trend from 1920 onwards the price shows a downward
trend from 1920 up to 1928. But after 1928, even with an upward
trend of supply, the trend of the price is also upward,
which fact may be due to increase in population, prosperity or
change in demand. A comparison of the long-time changes after
1928 would, therefore, suggest a positive correlation, while that
of those before 1928, a negative correlation. But when the long-time
variations are eliminated and we are left with short-time
oscillations, as given in columns 4 and 8, we may plot these
deviations on a graph (as we did in figure 29) and observe
a marked relationship between the two curves, now completely
unobscured by long-time changes. We would see that when
one curve rises the other falls and vice versa. The fact of
negative correlation would thus be made clear.

It follows, therefore, that to study the correlation for
long-time changes in a time series we should plot the moving
averages of the two series and compare their directions, while
to study the correlation for short-time oscillations we should
plot the deviations from the trend and observe their
movement.*
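
The rule just stated may be illustrated by a short sketch in Python. It is offered only as an illustration under stated assumptions: the two series are hypothetical (they are not the figures of table 55), a three-yearly centred moving average is assumed to represent the trend, and the names moving_average, deviations and pearson_r are introduced here for convenience.

```python
from math import sqrt

def moving_average(series, period=3):
    """Centred moving average for an odd period; the end years get no trend value."""
    half = period // 2
    return [sum(series[i - half:i + half + 1]) / period
            for i in range(half, len(series) - half)]

def deviations(series, period=3):
    """Short-time oscillations: actual value minus the moving-average trend."""
    half = period // 2
    trend = moving_average(series, period)
    return [series[i + half] - t for i, t in enumerate(trend)]

def pearson_r(x, y):
    """Pearson's coefficient of correlation between two lists of equal length."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical index numbers, assumed only for this illustration.
supply = [80, 84, 82, 88, 86, 92, 90, 96, 94]
price = [120, 112, 118, 106, 112, 100, 106, 94, 100]

# Long-time changes: correlate the two trends.
r_long = pearson_r(moving_average(supply), moving_average(price))
# Short-time oscillations: correlate the deviations from the trends.
r_short = pearson_r(deviations(supply), deviations(price))
print(round(r_long, 2), round(r_short, 2))
```

With these assumed figures both coefficients come out negative: the trends move in opposite directions, and so do the year-to-year oscillations about them.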


* See in this connection 'Comparison of time changes in two historical
series', Chapter XVI.



EXERCISES 

(1) Discuss fully what is meant by the coefficient of correla- 
tion and how it is measured and interpreted. 

(B. Com.^ Alld., 1942). 

(2) Define correlation coefficient. What inferences can you 
draw from the values +1, 0 and — 1 of this coefficient. 

(3) What is correlation? Explain how you will use the
following methods in determining correlation:

(i) Graph, (ii) Correlation table, (iii) Karl Pearson's Coefficient
of Correlation.

(B. Com., Agra, 1940). 

(4) Find graphically if the volume and value of imports of 
liquor (figures given in exercise 14, Chapter XV) are related to 
each other. 

(5) Find Karl Pearson's coefficient of correlation between
capital outlay and gross earnings from the data given in exercise
1^, Chapter XV.

(6) Find the correlation between exports and imports for 
1920-21 (figures given in exercise 7, Chapter XV). 

(7) Compute the coefficient of correlation of the short-time 
oscillations from the data relating to index numbers of X and Y 
(for the 1st 16 years only) given in exercise 3, Chapter XVI. 
Assume 6-yearly cycle and ignore decimals.

(8) Write notes on: 

Negative correlation, concurrent deviations, perfect correla- 
tion, correlation graph. 

(9) $\sigma_1 = 4.6$ and $\sigma_2 = 3.6$ are the standard deviations of
two groups $x_1, x_2, \ldots, x_n$ and $y_1, y_2, \ldots, y_n$, and $\sum xy = 4800$,
$n = 1000$.

Calculate the coefficient of correlation between the above two 
groups and interpret it. Also give the probable error of the co- 
efficient. 

(10) What is meant by the probable error of coefficient of 
correlation? Why and how is it measured? 

(11) Calculate the coefficient of correlation between the total 
receipts and the passengers given in exercise 27, Chapter XI.

(B. Com., Alld., 1982).





(12) Calculate the coefficient of correlation between Industrial
Production and Net Imports from the figures given in exercise 2, 
Chapter XX. 

(B. Com., Alld., 1939). 

(13) The following table gives the value of exports of raw
cotton from India and the value of the imports of manufactured
cotton goods into India during the years 1913-14 to 1931-32:

(In Crores of Rupees)

Year        Exports of      Imports of manufactured
            Raw Cotton      Cotton Goods
1913-14         42                 66
1917-18         44                 49
1919-20         68                 63
1921-22         66                 68
1923-24         89                 66
1929-30         98                 76
1931-32         66                 68


Calculate the coefficient of correlation between the value of 
the exports of raw cotton and the value of the imports of cotton 
manufactured goods. 

(M.A., Cal., 1937). 

(14) Calculate the coefficient of correlation from the following
data:

Amount of cheques cleared in Calcutta and Bombay Clearing
Houses (Crores of Rs.).

Year      Calcutta     Bombay
1925        1018         619
1926         969         421
1927        1024         398
1928        1088         643
1929         998         800
1930         893         712
1931         766         640
1932         747         646
1933         824         646
1934         864         688
1935         939         750
1936         899         721
1937         993         837


(15) The following table gives five yearly percentage area in
Bombay Presidency under cotton and under food-crops. Calculate
the coefficient of correlation between the area under cotton
and the area under food-crops:

Year      Percentage area     Percentage area
          under Cotton        under food-crops
1905          37.7                 55.5
1906          39.7                 52.5
1907          39.2                 52.S
1908          38.5                 52.7
1909          38.5                 52.3
1910          38.8                 53.0
1911          37.8                 53.5
1912          39.1                 52.5
1913          39.5                 52.3
1914          38.0                 54.9
1915          38.4                 54.3
1916          38.8                 53.2
1917          39.2                 52.6

(B. Com., Alld., 1935).

(16) Calculate the coefficient of correlation between the cost
of living and the weekly wage rates from the following data:

Date      Cost of Living     Index of Weekly
          Index              Wage Rates
1920          151                 155
1921          110                 120
1922          102                  99
1923          101                  98
1924          103                 101
1925          100                 101
1926          100                 102
1927           96                 100
1928           95                  99
1929           95                  99
1930           87                  98
1931           84                  96
1932           81                  94

(M.A., Alld., 1937).





(17) The following table gives the number of students having
different heights and weights.

Height in                    Weight in pounds
Inches      80-90   90-100   100-110   110-120   120-130   Total
50-55         1        3         7         5         2       18
55-60         2        4        10         7         4       27
60-65         1        5        12        10         7       35
65-70         -        3         8         6         3       20
Total         4       15        37        28        16      100


Do you find any relation between height and weight? 

(B. Com., Alld., 1940). 


(18) Find the coefficient of correlation between Y (retail food
price index) and X (wholesale food price index) from the following
table:

X    89    86    74    65    65     63    66     67    72     79
Y    82    9H    84    75    73^    72    70J    75    77i    84

(M.A., Alld., 1940). 

(19) Find the correlation coefficient between heights of father 
and son from the following data: — 


Height of father in inches    65   66   67   67   68   69   71   73
Height of son in inches       67   68   64   68   72   70   69   70


(M.A., Alld., 1940). 












(20) Find the coefficient of correlation between marks obtained
by candidates at an examination in two subjects A and B from
the following data:









Subject A 
Mu. 50 

11—15 


21—25 

26—30 

31—35 

Total 

1— 5 





1 

1 

1 

i 

6—10 

1 

1 

8 

7 

1 

rr 

18 

. ^ 

11—15 

1 

2 

4 

14 

4 

25 

16—20 



7 

13 

6 

j 

26 

21—25 

1 


2 

4 

1 

7 

26—30 



1 

j 

1 

! 

1 

31—35 




1 


1 

Total 

2 

3 

22 

39 i 

13 

i 79 

1 


(B. Com., Bombay, 1936). 


(21) The following table gives the frequency, according to
age-groups, of marks obtained by 65 students in an intelligence
test:

                      Age in years
Test Marks      19     20     21     22     Total
200-250          4      4      2      1       11
250-300          3      5      4      2       14
300-350          2      6      8      5       21
350-400          1      4      6      8       19
Total           10     19     20     16       65
















Is there any relation between age and intelligence? 

(22) What are the assumptions upon which the Pearsonian
coefficient of correlation is based? How does the positive correlation
differ from the negative? Compute the coefficient of correlation
of the short-time oscillations from the following data:

Year      Supply     Price
1921        80        146
1922        82        140
1923        86        180
1924        91        117
1925        83        183
1926        85        127
1927        89        115
1928        96         95
1929        93        100


(Assume a three-year cycle, and ignore decimals). 

(M. Com., Alld., 1948).

(23) From the following table, find out how far the fluctuations
in prices correspond to the amount of money in circulation
in India:

Year      Rupees and Notes in        Index Number of Prices
          Circulation in crores      (1873 = 100)
1912            248                        137
1913            256                        143
1914            248                        147
1915            266                        152
1916            297                        184
1917            338                        196
1918            407                        225
1919            463                        276
1920            411                        281
1921            893                        260

(B. Com., Agra, 1937).



Marks m Mathematics 






(24) Find the coefficient of correlation from the following
table.



(M.A., Cal., 1937).


(25) The following table shows the distribution of marks.
Calculate the coefficient of correlation and its probable error:

                           Marks in Geography
Range of Marks     0-20     20-40     40-60     60-80     Total
0-20                32        88        15        -         135
20-40               45       436       200        4         685
40-60               16       500       398       25         939
60-80                -       105       532       40         677
80-100               -         8        40       16          64
Total               93     1,137     1,185       85       2,500

(M.A., Cal., 1935).

(26) Calculate the coefficient of correlation between production
of Pig-Iron (percentage of trend, 1897-1918) and Industrial
Production (percentage of trend, 1897-1913) from the following
table:



(27) (a) Discuss fully what is meant by the coefficient of
correlation and how it is measured and interpreted.

(b) Calculate the coefficient of correlation from the following:

[Table: Subject (Age of husband) against Relative (Age of wife)]

(B. Com., Alld., 1942).











(28) What do you understand by coefficient of concurrent 
deviations ? 

Calculate the coefficient of concurrent deviations from the 
following data: — 

n or number of pairs of observations = 47. 
c or number of pairs of concurrent deviations =16. 

(29) Calculate the coefficient of concurrent deviations from
the data given in exercise 16.



CHAPTER XVIII


ASSOCIATION OF ATTRIBUTES.

Statistics of Attributes. 

Statistical methods deal with quantitative data alone. 
Quantitative character of data may arise in two ways. 

First, the investigator may note only the presence or 
absence of some attribute in a series of objects or individuals, 
and count the number of those who possess it and of those who 
do not. For instance, in a given population the number of 
the deaf and not-deaf, or of the sane and insane may be
counted. In such cases, the quantitative character arises 
solely in the process of counting. 

Second, the investigator may note or measure the actual 
magnitude of some variable character for every one of the ob- 
jects or individuals observed. For instance, height of students 
in a class, length of leaves of a certain tree or prices of certain 
commodities may be recorded. Such records are quantitative 
in character. In these cases, therefore, the observations them- 
selves are quantitative in character. 

The first kind of observations are termed as Statistics of 
Attributes, and the second as Statistics of Variables. So far, 
in all the chapters, we have been concerned with statistics of 
variables. We have studied how variables are analyzed, compared,
and correlated with one another. It is now proposed to
study how relationship can be established between two attri- 
butes, and how that relationship can be discovered by the 
method of Association. 

Notation and Terminology.

While discussing classification of data according to attributes
in chapter VIII it was pointed out that when one attribute
is noticed, two distinct classes are formed. These two
classes, however arbitrary their boundary, are exclusive of 
each other. Such classification was, in that chapter, denomi- 
nated as simple classification. It may also be referred to as 
division by dichotomy. 

To discuss the theory of association and its application in 
practice it is necessary to have some simple notation for the 
classes formed and for the measurements assigned to each of 
them. Accordingly, we shall use the capital letters
A, B, C, ... to denote the several attributes. An object or
individual possessing the attribute A will be termed simply A,
that possessing B, B. The class whose members possess the
attribute A will be termed the Class A. Similarly, we shall
use the small letters a, b, c, ... (generally, the Greek letters
α, β, γ, ... are used) to denote the absence of the attributes
A, B, C, ... Thus, if A represents the attribute blindness, a
represents sight, i.e., non-blindness; if B stands for insanity,
b stands for sanity. Combinations of attributes will be represented
by grouping together the letters that indicate the
attributes concerned. Thus, if A represents blindness and
B insanity, AB represents the combination blindness and insanity.
If the presence and absence of these attributes are
noticed, then

Combination AB stands for blindness and insanity
Combination Ab stands for blindness and sanity
Combination aB stands for sight and insanity
Combination ab stands for sight and sanity,

and, similarly, the class AB includes all those who are blind
and insane, the class Ab all those who are blind and sane, and
so on. If a third attribute be noted, for example, deafness, and
denoted by C, the class ABC includes those who are at once
blind, insane and deaf, and ABc those who are blind and insane
but not deaf.





The number of observations assigned to any class will be
termed the frequency of the class, or briefly class-frequency.
Class-frequencies will be denoted by placing the corresponding
class-symbols in brackets. Thus,

(A) denotes the number of A's, i.e., objects possessing
attribute A.

(Ab) denotes the number of Ab's, i.e., objects possessing
attribute A but not B, and so on.

The attributes denoted by capitals A, B, C, ... may be termed
positive attributes, and their contraries denoted by small
letters negative attributes. Thus the classes A, AB, ABC are
positive classes; the classes ab, abc, negative classes. AB
and ab, Ab and aB, AbC and aBc, are pairs of contrary classes.
A class specifying one attribute is known as the class of first
order; while that specifying two, that of the second order.
Thus, A is a class of the first order, AB or BC that of the
second order. Similarly, (A), (Ab), (aBC) are class-frequencies
of the first, second and third orders respectively.
The series of classes given by any one positive class and the
classes whose symbols are derived therefrom by substituting
small letters for one or more of the capital letters in all possible
ways will be termed an aggregate. Thus (AB), (Ab),
(aB), (ab) form an aggregate of frequencies of the second
order. When no attributes are specified, the total number of
observations constitutes the Universe with its limits specified,
and will be denoted by the letter N.

It should now be clear that the Universe must be equal
to the number of A's plus the number of a's. Similarly the
number of A's should equal the number of A's that are B plus
the number of A's that are not B; and so on. It means that
any class-frequency can be analyzed into class-frequencies of
a higher order. Thus,

N = (A) + (a),
N = (B) + (b),
N = (AB) + (Ab) + (aB) + (ab),
(A) = (AB) + (Ab),
(B) = (AB) + (aB),
(a) = (ab) + (aB) = N - (A).

The classes specified by attributes of the highest order
are termed the ultimate classes and their frequencies, the ultimate
class-frequencies. If we know (AB) and (Ab) we can
find (A); and, if in addition we know (aB) and (ab) we can
not only find (B) but also N. This is due to the fact, noted
above, that every class-frequency can be expressed as the
sum of certain of the ultimate class-frequencies. Therefore,
to specify the data completely, it is only necessary to know
the ultimate class-frequencies. An example will further clear
the point.

Example 1. Given the following ultimate frequencies, find 
the frequencies of the positive and negative classes and the 
whole number of observations, N: — 

(AB) = 100 ; (Ab) = 50.
(aB) = 80 ; (ab) = 40.

The whole number of observations N is equal to the grand
total: N = 270.

The frequency of any first-order class, e.g., (A), is given
by the total of the two second-order frequencies the class-symbols
for which contain the same letter. Thus,

(A) = (AB) + (Ab) = 100 + 50 = 150
(B) = (AB) + (aB) = 100 + 80 = 180
(a) = (aB) + (ab) = 80 + 40 = 120
or (a) = N - (A) = 270 - 150 = 120
(b) = (Ab) + (ab) = 50 + 40 = 90
or (b) = N - (B) = 270 - 180 = 90

         A       a       N
B       100      80     180
b        50      40      90
N       150     120     270

The nine-squares table given above affords an easy and
quick manner of getting the required class-frequencies. If
given values are filled in it, the required ones may be computed
from it.

Similarly, if eight ultimate frequencies of the third order 
are given, or sixteen ultimate frequencies of the fourth order 
are given, all the positive and negative class-frequencies and 
N can be obtained from them by mere addition. 

If the values of any two in each of the equations used in
the solution of the above example are known, the value of the
third can be easily found. For example,
if (a) = (aB) + (ab),
then (ab) = (a) - (aB).

And, the expression of any class-frequency in terms of the
positive frequencies is most easily obtained by a process of
step-by-step substitution; thus
(ab) = (a) - (aB)
     = [N - (A)] - [(B) - (AB)]
     = N - (A) - (B) + (AB).

The expression of other class-frequencies in terms of positive 
frequencies can be made by a similar process of substitution. 
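
The additions and substitutions described above may be sketched in Python as follows. This is only an illustration: the function names class_frequencies and negative_from_positive are introduced here for convenience, and the figures are those of Example 1.

```python
def class_frequencies(AB, Ab, aB, ab):
    """N and the first-order frequencies obtained by adding ultimate frequencies."""
    return {
        "N": AB + Ab + aB + ab,
        "A": AB + Ab,
        "B": AB + aB,
        "a": aB + ab,
        "b": Ab + ab,
    }

def negative_from_positive(N, A, B, AB):
    """(ab) expressed in positive frequencies: (ab) = N - (A) - (B) + (AB)."""
    return N - A - B + AB

# Ultimate frequencies of Example 1.
freq = class_frequencies(AB=100, Ab=50, aB=80, ab=40)
ab_check = negative_from_positive(freq["N"], freq["A"], freq["B"], AB=100)
print(freq)
print(ab_check)
```

The second function recovers (ab) = 40 from N, (A), (B) and (AB) alone, which is the step-by-step substitution carried out in the text.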

Probability and Expectation. 

When a coin is tossed once, it must fall heads or tails.
The probability (pure chance) that it would fall heads is 1/2.
When a coin is tossed 50 times, the expectation of a head
coming up is 1/2 × 50 = 25. Therefore expectation is equal to
the product of probability and the number of observations.
If two coins are tossed, the chance of two heads (or of two tails)
coming up is reduced to 1/2 × 1/2 = 1/4.

If two attributes A and B are studied in a universe N 
and the class-frequencies of the attributes are (A) and (B), 

Probability of (A) $= \frac{(A)}{N}$

Probability of (B) $= \frac{(B)}{N}$

Probability of (A) and (B) combined $= \frac{(A)}{N} \times \frac{(B)}{N}$

And, Expectation of (A) and (B) combined $= \frac{(A)}{N} \times \frac{(B)}{N} \times N = \frac{(A) \times (B)}{N}$


Criterion of Independence.

When actual observation is equal to expectation, attributes
are independent and there is no association between them.
In such a case we expect to find the same proportion of A's
amongst the B's as amongst the non-B's.

Let us take an example: 

Example 2. If (A) = people vaccinated = 50
(B) = people not attacked by small-pox = 60
(AB) = people vaccinated but not attacked = 20
N = Total number of people = 150,

it is required to find whether the attributes A (vaccination)
and B (freedom from attack) are independent.

In this case, expectation of (AB) $= \frac{(A) \times (B)}{N} = \frac{50 \times 60}{150} = 20$.


The actual observation (people vaccinated but not attacked) 
is thus equal to the expectation. Therefore, A and B are in- 
dependent. We conclude that vaccination and freedom from 
attack are not related to each other in the given case. 
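
The test applied in Example 2 may be sketched in Python as below. It is a minimal illustration, not a prescribed routine: the names expectation and is_independent are introduced here, and exact fractions are used so that the comparison does not suffer from rounding.

```python
from fractions import Fraction

def expectation(A, B, N):
    """Expected frequency of (AB) under independence: (A) x (B) / N."""
    return Fraction(A * B, N)

def is_independent(AB, A, B, N):
    """True when the actual observation equals the expectation."""
    return Fraction(AB) == expectation(A, B, N)

# Figures of Example 2.
expected_AB = expectation(A=50, B=60, N=150)
print(expected_AB)
print(is_independent(AB=20, A=50, B=60, N=150))
```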

Let us take another example : 


Example 3. If in the above example (AB) is not given,
but instead we know the number of people who were not
vaccinated and were attacked by small-pox, (ab), to be
equal to 60, we proceed to find whether the attributes a and b
are independent.

Expectation of (ab) $= \frac{(a) \times (b)}{N}$

Now (a) = N - (A) = 150 - 50 = 100
and (b) = N - (B) = 150 - 60 = 90

Therefore, expectation of (ab) $= \frac{100 \times 90}{150} = 60$.

Again, the actual observation is equal to the expectation.
In this case also the two attributes, a and b, are independent.

We can now put down the criterion of independence in
more convenient form when actual class-frequencies of the
second order only are given.

Attributes A and B are independent if (AB), actual observation,
$= \frac{(A) \times (B)}{N}$ (expectation).

Attributes a and b are independent if (ab), actual observation,
$= \frac{(a) \times (b)}{N}$ (expectation).

Attributes A and b are independent if (Ab), actual observation,
$= \frac{(A) \times (b)}{N}$ (expectation).

Attributes a and B are independent if (aB), actual observation,
$= \frac{(a) \times (B)}{N}$ (expectation).

Hence $(AB) \times (ab) = \frac{(A) \times (B)}{N} \times \frac{(a) \times (b)}{N} = \frac{(A) \times (b)}{N} \times \frac{(a) \times (B)}{N}$

But $\frac{(A) \times (b)}{N} \times \frac{(a) \times (B)}{N} = (Ab) \times (aB)$

Therefore, $(AB) \times (ab) = (Ab) \times (aB)$


This last equation gives the required criterion of independence 
in the case of actual ultimate frequencies of the second order. 
We take one more example. 

Example 4. Let the actual observations be as follows:
People vaccinated but not attacked by small-pox = 60 = (AB).
People not vaccinated and attacked by small-pox = 272 = (ab).
People vaccinated but attacked by small-pox = 80 = (Ab).
People not vaccinated and not attacked by small-pox = 204 = (aB).

It is required to find whether the attributes A and B are
independent.

According to the criterion just indicated, A and B are
independent only when

(AB) × (ab) = (Ab) × (aB)
Now, in the given case,

(AB) × (ab) = 60 × 272 = 16320
(Ab) × (aB) = 80 × 204 = 16320

The criterion is, therefore, satisfied, and, therefore, the 
attributes are independent, that is, not related to each other. 
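
The cross-product criterion lends itself to a very short check, sketched here in Python. The function name cross_product_independent is an assumption made for the illustration; the figures are those of Example 4, and the last lines merely confirm that the expectation of (AB) agrees with the same conclusion.

```python
from fractions import Fraction

def cross_product_independent(AB, Ab, aB, ab):
    """Criterion of independence: (AB) x (ab) = (Ab) x (aB)."""
    return AB * ab == Ab * aB

# Ultimate frequencies of Example 4.
AB, Ab, aB, ab = 60, 80, 204, 272
print(AB * ab, Ab * aB)
print(cross_product_independent(AB, Ab, aB, ab))

# The same conclusion through the expectation of (AB).
N = AB + Ab + aB + ab
expected_AB = Fraction((AB + Ab) * (AB + aB), N)
print(N, expected_AB)
```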


Association and Disassociation. 

In statistics the word association has a technical meaning, 
distinct from the one current in ordinary speech. Ordinarily 
one speaks of A and B as being associated if they appear 
together in a number of cases. It is not so in statistics, where 
A and B will be said to be associated only if they appear
together in a larger number of cases than is to be expected if
they are independent. The mere fact that some A's are B's,
however great the proportion, is not enough to show that
A and B are associated. This is a fundamental principle.

Association may be positive or negative. There is a simple
way of knowing it. If two attributes, A and B, are not independent,
but related to each other, then if

$(AB) > \frac{(A) \times (B)}{N}$

A and B are said to be positively associated. If, on the
contrary,

$(AB) < \frac{(A) \times (B)}{N}$

A and B are said to be negatively associated, or, briefly,
disassociated. It should be carefully noted that disassociation
does not mean the same thing as independence.

In example 4, with the data as given, A and B are independent;
that is, they are not related. If the actual observation
of people who were vaccinated but not attacked by small-pox,
that is, the class-frequency (AB), were more than 60
(the expectation), the attributes would have been positively
associated. But if the actual cases of (AB) were less
than 60, A and B would have been disassociated.


Upon the above principle we take an example. Let A = 40;
B = 35; a = 10; b = 15; AB = 30; Ab = 10; ab = 5; aB = 5; and
N = 50.

We can construct a table like the following:

         A       a       N
B        30       5      35
b        10       5      15
N        40      10      50

Table X
(Observation)


We now find the expectations.

Expectation of (AB) $= \frac{(A) \times (B)}{N} = \frac{40 \times 35}{50} = 28$

Expectation of (ab) $= \frac{(a) \times (b)}{N} = \frac{10 \times 15}{50} = 3$

Expectation of (B) $= \frac{35}{50} \times 50 = 35$

Expectation of (A) $= \frac{40}{50} \times 50 = 40$

And so on.

We may now construct a table of expectations as below:

         A       a       N
B        28       7      35
b        12       3      15
N        40      10      50

Table Y
(Expectation)


If we compare table X with table Y we shall be able to
study the fact of positive and negative associations. In table
Y, (AB) is 28, while in table X it is 30. This implies that
actual observation of (AB) is greater than its expectation, or
in other words

$(AB) > \frac{(A) \times (B)}{N}$

Therefore, vaccination and freedom from attack, to make
A and B stand for what they did so far in our examples, are
positively associated.

On the other hand, (Ab) in table Y is 12 and in table X
is 10. That is, actual observation is less than expectation.
Therefore, A and b are negatively associated. Or, in table Y
(aB) is 7, while it is only 5 in table X. This implies that

$(aB) < \frac{(a) \times (B)}{N}$

Therefore, a and B are disassociated or negatively associated.
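
The comparison of observation with expectation may be sketched in Python as follows. The sketch is illustrative only; the names expected_table and nature_of_association are assumptions introduced here, and the observed figures are those given for table X.

```python
from fractions import Fraction

def expected_table(AB, Ab, aB, ab):
    """Expectation of each second-order class if the attributes were independent."""
    N = AB + Ab + aB + ab
    A, a = AB + Ab, aB + ab
    B, b = AB + aB, Ab + ab
    return {
        "AB": Fraction(A * B, N),
        "Ab": Fraction(A * b, N),
        "aB": Fraction(a * B, N),
        "ab": Fraction(a * b, N),
    }

def nature_of_association(observed, expected):
    """Positive if observation exceeds expectation, negative if it falls short."""
    if observed > expected:
        return "positive"
    if observed < expected:
        return "negative"
    return "independent"

# Observed frequencies of table X.
observed = {"AB": 30, "Ab": 10, "aB": 5, "ab": 5}
expected = expected_table(**observed)
signs = {k: nature_of_association(observed[k], expected[k]) for k in observed}
print(expected)
print(signs)
```

The expectations so obtained are 28, 12, 7 and 3, and the signs agree with the conclusions drawn in the text: positive for AB and ab, negative for Ab and aB.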


Co-efficient of Association.

So far we have ascertained the fact of association by
comparing the class-frequencies with the expectations. We
have not measured the degree of association. Several co-efficients
have been devised for judging the intensity of association.
Of these, the following co-efficient due to Yule is the
simplest:

$Q = \frac{(AB)(ab) - (Ab)(aB)}{(AB)(ab) + (Ab)(aB)}$

where Q stands for the Co-efficient of Association. This Co-efficient
is zero when the attributes are independent, +1 if
they are completely associated and -1 if they are completely
disassociated.

Let us take an example. We compute the Co-efficient of
Association from the data given in table X of the above
example.

$Q = \frac{(30 \times 5) - (10 \times 5)}{(30 \times 5) + (10 \times 5)} = \frac{100}{200} = \frac{1}{2}$

Hence the intensity of association between the attributes
A and B is 1/2 and the association is positive.

Let us take another example. We compute the Co-efficient
of association from the data given in example 1.

$Q = \frac{(100 \times 40) - (50 \times 80)}{(100 \times 40) + (50 \times 80)} = 0$

Hence the attributes A and B are independent.

Yule’s Co-efficient of Association is quite easy to compute 
and is a convenient measure of association since it not 
only exhibits the intensity of positive and negative associations, 
but also shows the independent character of the attributes. 
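
Yule's co-efficient may be computed with a few lines of Python, sketched below under the notation of this chapter. The function name yules_q is introduced here for illustration; the first two sets of figures are those of table X and of example 1, while the third is a hypothetical case with (aB) = 0, assumed only to show the value +1. The function fails if both products are zero, a case not considered in the text.

```python
from fractions import Fraction

def yules_q(AB, Ab, aB, ab):
    """Q = [(AB)(ab) - (Ab)(aB)] / [(AB)(ab) + (Ab)(aB)]."""
    numerator = AB * ab - Ab * aB
    denominator = AB * ab + Ab * aB
    return Fraction(numerator, denominator)

q_table_x = yules_q(AB=30, Ab=10, aB=5, ab=5)        # data of table X
q_example_1 = yules_q(AB=100, Ab=50, aB=80, ab=40)   # data of example 1
q_complete = yules_q(AB=30, Ab=10, aB=0, ab=5)       # hypothetical: no aB's at all
print(q_table_x, q_example_1, q_complete)
```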


Partial Association.


If in a given case it is found that

$(AB) >$ or $< \frac{(A) \times (B)}{N}$,

all that this information leads us to is that A and B are related
with each other in some way. We cannot say whether the
relationship is direct or of any other kind. It is possible that
association between A and B may not be direct, but due to
the association of A with C and of B with C. An example
will make the point clear.

An association is observed between 'vaccination' and
'exemption from attack by small-pox', that is, more of the
vaccinated people are exempt from attack than the unvaccinated
ones. It may be argued that this does not imply that
vaccination protects the people from attack, but that most of 
the unvaccinated are drawn from the lowest classes, living in 
insanitary and filthy conditions. Thus A (vaccination) and B 
(exemption from attack) are associated due to the association 
of both with C (hygienic conditions). 

The ambiguity in the above case arises from the fact that
the universe contains not only the objects possessing the third
attribute alone, or objects not possessing it, but both. In our
example, both hygienic and non-hygienic conditions may be
prevailing in the locality where observations have been made.
If, however, the universe of observation were confined to either 
class alone, for instance, the observations relating to vaccina- 
tion and attack were made from a narrow section of the 
population living under approximately identical hygienic 
conditions, and still A and B were found to be associated, the 
above ambiguity would not arise. 

However, the associations found between the attributes
A and B in the universe of C's and the universe of c's are termed
partial associations, to distinguish them from total associations
found between A and B in the universe at large.

Partial association may prove to be entirely misleading, 
for what is true of the whole is not true of each of the parts. 
To take our example again, observations regarding vaccination 
and attack may be drawn from people living under same 
hygienic conditions, yet some of the people may be rich and 
others poor. There may be a positive association between 
vaccination and exemption from attack among the rich but 
not among the poor. The disparity between these results may 
be explained by the simple fact that the poor are more open
to attack than the rich, so that attack is not independent of
poverty. Or, it may be that only the rich get themselves
vaccinated, and vaccination is, in this case, not independent
of poverty.

Thus an illusory or misleading association may arise in a 
case where in the given universe there exists a third attribute 
C with which both A and B are associated, positively or nega- 
tively. If both associations are of the same sign, the resulting 
illusory association between A and B will be positive; if of 
opposite signs, the illusory association will be negative. For 
example, if the associations between A and C, and between
B and C are positive, they would give rise to an illusory
positive association between A and B. 
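
How an illusory total association can arise may be seen from a small numerical sketch in Python. The eight third-order frequencies below are hypothetical and have been chosen only for the illustration; the names yules_q, in_C and in_c are likewise assumptions. A and B are independent within the universe of C's and within the universe of c's, yet both are positively associated with C.

```python
from fractions import Fraction

def yules_q(AB, Ab, aB, ab):
    """Yule's co-efficient of association for one universe."""
    return Fraction(AB * ab - Ab * aB, AB * ab + Ab * aB)

# Hypothetical ultimate frequencies of the third order.
in_C = {"AB": 80, "Ab": 20, "aB": 40, "ab": 10}   # (ABC), (AbC), (aBC), (abC)
in_c = {"AB": 10, "Ab": 40, "aB": 20, "ab": 80}   # (ABc), (Abc), (aBc), (abc)

# Partial associations: A and B within each sub-universe.
q_in_C = yules_q(**in_C)
q_in_c = yules_q(**in_c)

# Total association: A and B in the universe at large.
total = {k: in_C[k] + in_c[k] for k in in_C}
q_total = yules_q(**total)
print(q_in_C, q_in_c, q_total)
```

Both partial co-efficients are zero, while the total co-efficient is positive (5/13 with these assumed figures), which is the illusory association described above.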

Illusory association may also arise in a different manner,
that is, through the personality of the observer. If the attention
of the observer fluctuates, it is likely that he may observe
the presence of A when he observes the presence of B, and vice
versa. In such a case A and B will both be associated with
the observer's attention C, and an illusory association will
result.


EXERCISES 

(1) Examine statistically the efficiency of inoculation as a
preventive against cholera from the following data:

Of the total population of 8,100 in a village 1750 were inoculated
against cholera, of whom 25 persons were attacked with the
disease. Of the population not inoculated, 700 persons were
attacked.

(2) Given the following ultimate class-frequencies find the
frequencies of the positive and negative classes and the whole
number of observations, N:

(AB) = 200 ; (Ab) = 100
(aB) = 160 ; (ab) = 80

(3) From the following data find whether the attributes A
and B are independent:

(A) = 100, (B) = 120, (AB) = 40, N = 300.





(4) If (AB) = 120, (ab) =54E (At) = 1(10 and (aH»=408, 
find whether the attributes A and B ate independent. 

(5) Given the following ultimate , class-frequencies, find the 
frequencies of the positive classes. 

(ABC)= 298 (A6C)= 450 (aBC)= 408 (abC)=^ 342 

(ABc). = 147(5 (Abe) =2292 (nBc) =3524 (ahe) =43684 

(6) Given the following frequencies of the positive classes,
find the frequencies of the ultimate classes:

(N) = 47,426     (ABC) = 312
(A) = 3,236      (AB) = 856
(B) = 4,030      (AC) = 670
(C) = 1,540      (BC) = 312


(7) Show whether A and B are independent, positively associated
or negatively associated in the following cases:

I.   N = 1000      (A) = 470      (B) = 620     (AB) = 320
II.  (AB) = 512    (aB) = 1536    (Ab) = 96     (ab) = 288
III. (A) = 245     (AB) = 147     (a) = 285     (aB) = 190

(8) Investigate the association between darkness of eye-colour
in father and son from the following data.

Fathers with dark eyes and sons with not dark eyes = 237
Fathers with dark eyes and sons with dark eyes = 150

Fathers with not dark eyes and sons with dark eyes = 267
Fathers with not dark eyes and sons with not dark eyes = 2346

(9) Given the following data find whether deaf-mutism and
baldness are associated:

Total population . . . . . . 16,264,000

Number of the bald-headed . . 24,441

Number of the deaf-mutes . . . . 7,623

Number of the bald-headed deaf-mutes . . 225

(10) Find the association between eye-colour of husband and
eye-colour of wife from the following data:

Husbands with light eyes and wives with light eyes . . 1236
Husbands with light eyes and wives with not light eyes . . 856

Husbands with not light eyes and wives with light eyes . . 528

Husbands with not light eyes and wives with not light eyes . . 476




(11) Find the coefficient of association between inoculation
and exemption from serious tuberculosis from the following table:

Cattle                      Died of tuberculosis   Unaffected or only   Total
                            or very seriously      slightly affected
                            affected

Inoculated with vaccine            18                     39             57
Not inoculated                     24                      9             33
Total                              42                     48             90


(12) The following table gives the number of persons suffering
from certain infirmities in Bengal in 1931:

Sex        Total number    Insane    Deaf-mutes    Deaf-mutes and insane

Males      200 lakhs       12,650    21,801        545
Females    241 lakhs        9,055    14,186        817

Trace the association between insanity and deaf-muteness for
males and females of Bengal separately.

(M.A.. Alld., 19^8). 

(13) (a) Write a short note on the use of the Coefficient of
Association in analyzing economic statistics.

(b) From the figures given in the following table, compare
the association between literacy and unemployment in rural and
urban areas, and give reasons for the difference, if any:


                                  Urban        Rural

Total Adult Males                 25 Lakhs     200 Lakhs
Literate Males                    10 Lakhs     10 Lakhs
Unemployed Males                   5 Lakhs     12 Lakhs
Literate and Unemployed Males      3 Lakhs      4 Lakhs


(M.A., Alld., 1937).



(14)
                  Not attacked    Attacked    Total

Inoculated             276            3         279
Not inoculated         473           66         539
Total                  749           69         818


Find the association between inoculation against cholera and 
exemption from attack. 

(15) In the course of anti-malarial work quinine was administered
to 606 adults out of a total population of 3,540. The
incidence of malarial fever is shown below. Discuss the preventive
value of quinine.


               Fever    No fever    Total

Quinine          19        587        606
No Quinine      193      2,741      2,934
Total           212      3,328      3,540


(M.A., Cal., 1935). 







(16) Criticize the following arguments:

(1) 99 per cent of the people who take alcohol die before
they reach the age of 80 years. Therefore, taking
alcohol is bad for longevity.

(2) 99 per cent of the members who voted for the tenancy
bill were cultivators. Therefore it was unfair to
suppose that the voting was unbiassed.

(17) Out of 14 thousand literates in a certain district of
India, there were 100 criminals.

Out of 186 thousand illiterates in the same district, there were
thousand criminals.

Is there any association between illiteracy and criminality?

(18) Investigate whether there is any association between
extravagance in father and son from the following data:


Extravagant sons with extravagant fathers . . 496

Miser sons with extravagant fathers . . . . 162

Extravagant sons with miser fathers . . . . 184

Miser sons with miser fathers . . 1158



CHAPTER XIX 


INTERPOLATION AND FORECASTING

Interpolation stands for the insertion of the most likely
estimate under certain assumptions. In chapter X, the mode
and the median were interpolated in the modal and the median
classes respectively; but this was done only by starting with
certain assumptions in both cases. In locating the mode
in a continuous frequency distribution it is assumed, as was
done in chapter X, that the mode is influenced by the
class-intervals adjacent to the modal class; while, in locating the
median in a similar series, it is assumed that the frequencies of
the median class are uniformly distributed over its magnitude.
Location of the mode and the median in a grouped distribution
thus furnishes examples of interpolation, and also shows the
usefulness of this device for estimating some missing figure in a series.

Necessity of Interpolation. 

In the absence of complete data at our disposal there
would be no way out, except that of resorting to interpolation,
to find the values of the mode and the median. Hence the necessity
of this method in such cases. But there are cases other than
these where gaps may have to be filled in. Such gaps may
be due to the fact that no record has been made, or its details
are insufficient, or it has been lost or destroyed. Cases in
point arise in connection with returns like those of the census
which are, and can be, taken only once in a few years, so that
if population figures are wanted for any intervening year,
as they are in several instances, an estimate has to be made
of the most likely figures from the results already recorded.
For example, it may be necessary for purposes of administration
or the like for a local or central government to be able
to know with a reasonable degree of accuracy the population
of an urban or rural area, or a province, at any given time,
or to know the area under particular crops, or the area under
irrigation. Similarly a sociologist, an economist or a businessman
may be interested in knowing a likely estimate of a
certain phenomenon he is concerned with. A sociologist may
like to know the number of people in different age-groups
during the intercensal period, an economist may desire to have
a knowledge of the total tax-revenue raised in a certain year,
while a businessman may like to fill up the gaps in his yearly
sales register. In all these cases it cannot be supposed,
without any valid reason, that the figures relating to a past year
would apply to the year whose figures are required to be estimated.
Nor can mere imaginary figures be relied upon. A
most likely estimate has to be made.

Such an estimate may relate to some past date or to a
future one. The technique of estimating a past figure is termed
Interpolation, while that of estimating a probable figure
for the future is called Extrapolation. To make an estimate,
certain assumptions are necessary.

Assumptions. 

The first assumption that is made in interpolation or
extrapolation is that there are no sudden jumps from one
period to another. If population figures for India for 1911,
1921, and 1941 are given and an estimate has to be made for
the figure for 1931, this would be done only when it is assumed
that there was no violent disturbance in the intermediate
dates, nor was the year 1931 an exceptional year such as one
affected by epidemics, war or other calamity.

The second assumption is that, in the absence of evidence
to the contrary, the rise or fall has been uniform. That is, in
our example, the population growth has to be assumed to be
uniform between 1921 and 1941, unless the figure for 1911 or some
other information modifies this assumption.

Accuracy of Interpolation.

Upon the above assumptions figures may be interpolated,
but the question arises: what is the certainty
that the interpolated figures, which by hypothesis are unknown,
are in reality the most probable figures? In the
words of Dr. Bowley, "the accuracy of interpolation depends
(1) on knowledge of the possible fluctuations of the figures,
to be obtained by a general inspection of the fluctuations at
dates for which they are given; (2) on knowledge of the
course of the events with which the figures are connected."*

It follows, therefore, that in basing arguments upon such
figures the fact that they are interpolated ones should not
be lost sight of. Interpolated figures are based on quite a
different class of evidence from those which result from
direct evidence. In some instances interpolations may represent
figures which do not exist and which are used only for
convenience of calculation. For instance, in allotting monthly
marks to a student who was absent from a few seminars, attention
may be paid to the student's general place in the class
and to the average marks got by the students present in those
particular seminars. Marks thus allotted have no existence.
In other cases, interpolated figures may be, in the absence
of the knowledge of full facts, most probable estimates of
figures that really exist. Therefore, all such estimates
must be indicated as interpolations; it is always
better to point out the method by which they are obtained.
If any subsidiary information, which may be regarded
as direct evidence of the accuracy of interpolated figures,
is available it is well to state it also. Further, if practicable,
interpolated figures should be stated not as exact ones, but as
lying in a range within which their accuracy may not be
questioned.

* Bowley, A. L., Elements of Statistics, 1920 ed., p. 217.

Methods of Interpolation. 

Figures can be interpolated by the graphic method or by
algebraic treatment. The graphic method is good to follow when
the quantities show cyclical character. We shall discuss below,
with suitable examples, the graphic method, the method
of fitting a parabolic curve, the method of advancing differences
(Newton's method) and Lagrange's formula.

The Graphic Method. 

Graphic method in a continuous series: This method may
be explained by an example. Table 63 gives the population of
the province of Bengal during the last seven censuses. Column
1 of the table shows the independent variable (x), which advances
by an equal increment of 10 years. Column 2 shows
the corresponding values of the dependent variable (y).


Table 63. Population of Bengal, 1881-1941.

Year      Population in lakhs
 x             y

1881      [illegible]
1891      391
1901      421
1911      455
1921      467
1931      501
1941      603


Suppose the figure 421 for the year 1901 is not given, and
we are required to interpolate it. The task of interpolation
would depend upon the evidence available for the purpose. If,
for example, we know only the figures 391 and 455 relating to
the years 1891 and 1911 respectively, we may plot them on
graph paper with years on the base line and population on the
vertical scale. Since only a straight line would result from
joining the points, we have no alternative but to assume
that the population between 1891 and 1911 rises at a uniform
rate. In the absence of any information to the contrary this is
the most correct assumption possible. The height of the ordinate
drawn from 1901 to intersect the straight line would give us
an estimate of the population of Bengal in 1901. This would be
equal to $\frac{455 + 391}{2}$ or 423 lakhs, which exceeds the actual
figure (421 lakhs) only by 2 lakhs. The difference is not very
great, the mistake being of a little less than .5 per cent. If
an estimate of population for any other year during the intercensal
period is required, an ordinate from the particular year
may be raised to intersect the straight line. The height of
the ordinate on y would give the figure for the particular year.
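The straight-line estimate just described can be checked numerically. The following is a minimal sketch in Python, not part of the original text; the function name linear_interpolate is chosen here only for illustration, and the figures are those quoted above (391 and 455 lakhs for 1891 and 1911, with 421 lakhs as the recorded figure for 1901).

```python
def linear_interpolate(x0, y0, x1, y1, x):
    """Height of the straight line through (x0, y0) and (x1, y1) at x."""
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


# Population of Bengal in lakhs, as quoted in the text.
estimate = linear_interpolate(1891, 391, 1911, 455, 1901)

# Percentage by which the estimate exceeds the recorded 1901 figure (421 lakhs).
error_pct = (estimate - 421) / 421 * 100

print(estimate)             # 423.0
print(round(error_pct, 3))  # 0.475, a little less than .5 per cent
```

The same function gives an estimate for any other intercensal year, for instance linear_interpolate(1891, 391, 1911, 455, 1896), which corresponds to raising an ordinate from 1896 to the straight line.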


If, on the other hand, figures are available for all the
years excepting 1901, the various y's shall be plotted against
their respective x's on a graph. If the resulting points are
joined by straight lines with a ruler, we will have to assume
sudden jumps in the growth of population, at least at the
figures for the years 1911, 1921 and 1931, i.e., at points C, D and
E in figure 31. We have read that rather than making such
an assumption, we should assume that in the absence of evidence
to the contrary sudden changes in the quantities from
one period to another do not occur. Therefore, instead of
joining the points by straight lines we should draw through
them all a line whose curvature is as smooth as possible. Such
a curve may be constructed on mathematical principles or
drawn freehand. In figure 31 it has been drawn freehand,
connecting each (x, y) point. To find the y proper to 1901,
we have drawn an ordinate through 1901 intersecting the
curve at P. The height of this ordinate, which is 421, gives
the figure in lakhs for the population of Bengal in 1901. The
remarkable closeness with which the interpolated figure agrees
with the actual one given in table 63 is more or less accidental.


[Fig. 31. Interpolation of Population of Bengal by Graphic Method. Vertical axis: population in lakhs; horizontal axis: years, 1881 to 1941.]


The principle followed in figure 31 is not an unreasonable one
to adopt, for, in effect, it gives due weight to each of the
observations (y's) actually recorded, and it assumes an even
course from each year to the next, a quite justifiable assumption
in the absence of any evidence that some sudden discontinuity or
break has taken place in the y's.

Graphic method and periodic figures: If we have a series of
monthly averages of figures relating to a certain phenomenon,
say sales of silver or price of wheat in India, and the averages
show periodic fluctuations, which we can study by the method
discussed while dealing with analysis of time series, we can
interpolate figures for any month for which records are
incomplete. This can be done with a fair degree
of accuracy. In general, the amount of sales of silver in India
would show a rise during summer every year, for that is the
marriage season in the country, and the prices of wheat
would show a fall in April-May when the new crop appears on
the market. The curves drawn for such phenomena would
exhibit a kind of periodicity; they would regularly rise
and fall. This would enable the filling in of unknown figures
in a manner which would not be unsatisfactory. For, if we
know that a curve ought to rise or fall up to a certain limit
to be in conformity with its periodicity, we get a reliable
clue to the position of the missing figures. Further,
even those figures which lie at the two ends of our series of
averages, and which probably cannot be found by any other
method, can be traced out by the graphic method, when once
the cyclic character of the curve has been known. We shall
see later the usefulness of such curves for purposes of forecasting.

Graphic method and Correlation Curves: If, by the use of
correlation graphs, discussed in chapter XVII, we are able to
find a close connection between two series, we can use one of
them, whichever is more complete, to help in interpolating a
missing figure in the other. First, we should carefully study the
closeness between them at the dates for which we have complete
figures in both series, and then draw a figure similar to
figure 17, one of the lines being, of course, incomplete. Thereafter,
we may complete the incomplete line in such a manner
that the completed line closely resembles the
other line. Thus, we shall obtain the most probable values for
the figures which are missing in the incomplete series. This
method can be very usefully employed in interpolating figures
for the values of exports from those of imports, for the amount
of money in circulation from figures of prices, for the production
of sugar in India from statistics of imports of sugar, for
changes in parts of the population from changes in the whole,
and for many other series, but only if we know that correlation
exists between them.


Graphic method of interpolation is possible only in the
case of a continuous series, since only a continuous series, and
not a discrete series, is capable of being represented by a
curve. We cannot, for instance, interpolate the population
figure for India for a certain year if we know the population
figures for various other countries of the world; but we can
fairly correctly estimate the population figure for India in a
given year on the basis of population figures for the country
for several years.


Algebraic Treatment 

The problem of interpolation to which greatest attention 
has been paid is as follows : 

If one quantity is subject to continuous regular change, 
and a second quantity changes in connection with it, and if 
we know or can estimate directly only some discontinuous 
value of this second quantity, then it is required to estimate 
the most probable value of the second quantity corresponding 
to given values of the first. For instance, given the annual 
premia payable on a life policy at ages 25, 30, etc. years, it 
is required to interpolate the premium for intermediate ages; 
or, given the population of India in 1881, 1891, 1901 and 
1911 it is required to interpolate it for intermediate dates or 
extrapolate it for future dates. 

Two assumptions, as already noted, are made in such cases.
Firstly, it is assumed that the quantity (premium, or population
in the above examples) changes continuously, that is,
without any sudden break at any figure. And, secondly, it is
assumed that the rate of change of the quantity is likewise
continuous, so that the curve representing it is smooth, and
not angular.

The problem stated above can be tackled systematically 
by using the algebraic method of finite differences. 



We take up below three of the several methods available
for interpolation.

First Method: fitting with a parabolic curve. To make
the argument as general as possible we shall speak of x and y
as variables, and assume the value of y as depending on that
of x in such a way that when x is given, y is known or can be
estimated.

Suppose

$$y = a + bx + cx^2 + \dots$$

where a, b, c, ... are constants to be determined, and their
number can be made to depend upon the number of known
values of y. The equation

$$y = a + bx + cx^2 + \dots + kx^n$$

represents a curve called a parabola of the nth order.

Let us illustrate the method by fitting a parabolic curve
to the following figures giving the population of Allahabad at
decennial censuses:

Year                        1901    1911     1921     1931

Population in thousands     172     171.7    157.2    183.9


It is required to interpolate the population for 1916.
Assuming that no abnormal conditions prevailed in 1916 to
cause a sudden change in the population of the city, let us proceed
to estimate the population for that year with the help of the
given data. Since the known points are four in this particular
case, we would take as the curve through them a parabola of
the 3rd order, viz.,

$$y = a + bx + cx^2 + dx^3 \qquad (1)$$

Then, the four known points would be just sufficient to determine
the four constants a, b, c, d. Now, the x class-intervals
are equal, being of 10 years each; we measure x from 1916 as
origin, and get

x = -15,    -5,      0,    +5,      +15

y = 172,    171.7,   y0,   157.2,   183.9

where y0 is the number to be estimated.

To further simplify the algebra we may take 5 years as the
unit for x, so that

x = -3,     -1,      0,    +1,      +3

y = 172,    171.7,   y0,   157.2,   183.9

Since all five points are to lie on the curve with equation
(1), we substitute the above values of x and y in the
equation, and get

$$\begin{aligned}
172   &= a - 3b + 9c - 27d && \text{(i)} \\
171.7 &= a - b + c - d     && \text{(ii)} \\
y_0   &= a                 && \text{(iii)} \\
157.2 &= a + b + c + d     && \text{(iv)} \\
183.9 &= a + 3b + 9c + 27d && \text{(v)}
\end{aligned}$$

Adding (i) and (v),

$$2a + 18c = 172 + 183.9 \qquad \text{(vi)}$$

Adding (ii) and (iv),

$$2a + 2c = 171.7 + 157.2$$

or

$$18a + 18c = 9(171.7 + 157.2) \qquad \text{(vii)}$$

Subtracting (vi) from (vii),

$$\begin{aligned}
16a &= 9(171.7 + 157.2) - (172 + 183.9) \\
    &= 2960.1 - 355.9 \\
    &= 2604.2
\end{aligned}$$

Therefore, a = 162.7625 thousands,
or, y0 = 162,763.

The population of Allahabad, as interpolated, is 162,763
for 1916.

In a like manner interpolation of population for any 
other intercensal year or extrapolation for any year after 1931 
can be made. 
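The elimination above can be reproduced mechanically. What follows is a minimal sketch in Python, not part of the original text: the helper fit_polynomial is a name introduced here for illustration, and it solves the simultaneous equations (i), (ii), (iv) and (v) by ordinary Gauss-Jordan elimination on exact fractions rather than by the pairwise addition used above. The constant term of the fitted cubic is the interpolated value y0 at the origin, 1916.

```python
from fractions import Fraction


def fit_polynomial(xs, ys):
    """Coefficients [a, b, c, ...] of the polynomial of degree len(xs) - 1
    passing through every point (xs[i], ys[i]), found by elimination."""
    n = len(xs)
    # Each row holds 1, x, x^2, ... followed by the known value y.
    rows = [[Fraction(x) ** p for p in range(n)] + [Fraction(y)]
            for x, y in zip(xs, ys)]
    for col in range(n):
        # Bring a row with a non-zero entry in this column into position.
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        lead = rows[col][col]
        rows[col] = [v / lead for v in rows[col]]
        # Clear this column from every other row.
        for r in range(n):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [v - factor * w for v, w in zip(rows[r], rows[col])]
    return [rows[i][n] for i in range(n)]


# x in units of 5 years measured from 1916; y in thousands (Allahabad).
# The y values are given as strings so that the fractions are exact.
coeffs = fit_polynomial([-3, -1, 1, 3], ["172", "171.7", "157.2", "183.9"])
a = coeffs[0]
print(float(a))  # 162.7625
```

Evaluating a + bx + cx^2 + dx^3 with the remaining coefficients at another value of x gives the estimate for any other intercensal year on the same assumptions.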





Second Method: means of advancing differences.
This method is also known as Newton's method. The following
figures, table 64, show the amount of annual premia required
by an insurance company to secure Rs. 1,000 without
profits. It is required to calculate the amount of premium
payable at the age of 22 next birthday.


Table 64. Annual Premia on a life policy of Rs. 1,000.

Age next     Annual                       Differences
birthday     premium
in years     in Rs.         First    Second   Third    Fourth   Fifth
   x            y             Δ        Δ²       Δ³       Δ⁴       Δ⁵

20 (x0)      25    (y0)
                              3
25 (x1)      28    (y1)                1
                              4                 0
30 (x2)      32    (y2)                1                 .5
                              5                 .5               -.25
35 (x3)      37    (y3)                1.5               .25
                              6.5               .75
40 (x4)      43.5  (y4)                2.25
                              8.75
45 (x5)      52.25 (y5)


Each entry in the difference columns is formed by taking
the algebraic difference of the entries on the left. Thus,

$$\Delta_0 = y_1 - y_0 = 28 - 25 = 3; \qquad \Delta_1 = y_2 - y_1 = 32 - 28 = 4;$$
$$\Delta^2_0 = \Delta_1 - \Delta_0 = 4 - 3 = 1; \qquad \Delta^2_1 = \Delta_2 - \Delta_1 = 5 - 4 = 1;$$
$$\Delta^3_0 = \Delta^2_1 - \Delta^2_0 = 1 - 1 = 0; \qquad \Delta^5_0 = \Delta^4_1 - \Delta^4_0 = .25 - .5 = -.25.$$

In this manner, differences have been calculated in
columns 3, 4, 5, 6 and 7 of the table.

The formula to be used for interpolating the value of y for
a given x, due to Newton, is

$$y_x = y_0 + x\Delta_0 + \frac{x(x-1)}{1 \times 2}\,\Delta^2_0 + \frac{x(x-1)(x-2)}{1 \times 2 \times 3}\,\Delta^3_0 + \dots$$

It is required to know the amount of premium payable at
age 22 years next birthday.

Now, from the table we find

$$x = \frac{22 - 20}{5} = \frac{2}{5} = .4; \qquad y_0 = 25;$$
$$\Delta_0 = 3; \quad \Delta^2_0 = 1; \quad \Delta^3_0 = 0; \quad \Delta^4_0 = .5; \quad \Delta^5_0 = -.25.$$

Hence up to this order of differences, the required amount
of premium, found by substituting these values in the
formula, is

$$\begin{aligned}
y &= 25 + (.4 \times 3) + \frac{.4(.4-1)}{1 \times 2} \times 1 + \frac{.4(.4-1)(.4-2)}{1 \times 2 \times 3} \times 0 \\
  &\quad + \frac{.4(.4-1)(.4-2)(.4-3)}{1 \times 2 \times 3 \times 4} \times .5 + \frac{.4(.4-1)(.4-2)(.4-3)(.4-4)}{1 \times 2 \times 3 \times 4 \times 5} \times (-.25) \\
  &= 25 + 1.2 - .12 + 0 - .0208 - .0075 \\
  &= 26.05.
\end{aligned}$$

The required annual premium payable at age 22 years is
Rs. 26.05.

Interpolation for the value of y for any other x can also
be made in a like manner.
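The working of the method of advancing differences may be sketched in a few lines of Python. This is an illustration only, not part of the original text; the function names forward_differences and newton_forward are introduced here, and the premia are those of table 64, taken at equal intervals of 5 years starting from age 20.

```python
def forward_differences(ys):
    """Leading entries y0, Δ0, Δ²0, ... of the difference table."""
    leading = [ys[0]]
    column = list(ys)
    while len(column) > 1:
        # Each new column is the difference of adjacent entries on its left.
        column = [later - earlier for earlier, later in zip(column, column[1:])]
        leading.append(column[0])
    return leading


def newton_forward(x_start, step, ys, x):
    """Newton's advancing-difference estimate of y at x, for values ys
    recorded at x_start, x_start + step, x_start + 2 * step, ..."""
    u = (x - x_start) / step   # the x of the formula, in units of one interval
    total = 0.0
    coeff = 1.0                # u(u-1)...(u-k+1) / (1 x 2 x ... x k)
    for k, diff in enumerate(forward_differences(ys)):
        total += coeff * diff
        coeff *= (u - k) / (k + 1)
    return total


premia = [25, 28, 32, 37, 43.5, 52.25]   # table 64, ages 20, 25, ...
premium_at_22 = newton_forward(20, 5, premia, 22)
print(round(premium_at_22, 2))  # 26.05
```

Because all six recorded values are used, the result carries the differences up to the fifth order, as in the hand calculation.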

Newton's formula uses differences running in a diagonal
direction, and is suited for interpolation near the beginning
and end of the table. It should be noted that it is used in a
case in which the independent variable x advances by equal
increments. In table 64, x advances by 5 years. When it is
required to interpolate in a frequency distribution it is better
to work with the cumulative numbers. For example, if from
the frequency distribution of marks given in table 22 it is
desired to know the total number of candidates who obtained
marks not exceeding 15, the table, for purposes of calculating
the differences and of interpolating the number of students,
would be written as follows (table 65), and interpolation
carried out as above.





Table 65. Cumulative frequency of marks in Economics.

Number of marks          No. of candidates
       x                        y

Not more than 10                4
 "    "    "  20               12
 "    "    "  30               23
 "    "    "  40               38
 "    "    "  50               49
 "    "    "  60               56
 "    "    "  70               60


Third Method: Lagrange's formula. When the recorded
y's correspond to x's, and the x's advance by unequal intervals,
the most convenient formula to use is that due to the famous
French mathematician, Lagrange, known after his name as
Lagrange's formula.

Representing the quantities as before by

$$(x_0, y_0),\ (x_1, y_1),\ (x_2, y_2),\ \dots,\ (x_n, y_n),$$

Lagrange's formula runs thus:

$$\begin{aligned}
y_x &= y_0\,\frac{(x - x_1)(x - x_2)\cdots(x - x_n)}{(x_0 - x_1)(x_0 - x_2)\cdots(x_0 - x_n)} \\
    &\quad + y_1\,\frac{(x - x_0)(x - x_2)\cdots(x - x_n)}{(x_1 - x_0)(x_1 - x_2)\cdots(x_1 - x_n)} \\
    &\quad + \cdots \\
    &\quad + y_n\,\frac{(x - x_0)(x - x_1)\cdots(x - x_{n-1})}{(x_n - x_0)(x_n - x_1)\cdots(x_n - x_{n-1})}
\end{aligned}$$


The following table relates to income earned per month 
by a certain number of workers in a big manufacturing 
concern. 






Table 66. Monthly income of workers.

Income not exceeding Rs.      Number of persons
          x                          y

      15  (x0)                    36  (y0)
      25  (x1)                    40  (y1)
      30  (x2)                    45  (y2)
      35  (x3)                    48  (y3)

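It is required to estimate the number of workers getting
not exceeding Rs. 26 per month.

Making use of the data given, and taking x = 26, we have

$$\begin{aligned}
y &= 36 \times \frac{(26-25)(26-30)(26-35)}{(15-25)(15-30)(15-35)}
   + 40 \times \frac{(26-15)(26-30)(26-35)}{(25-15)(25-30)(25-35)} \\
  &\quad + 45 \times \frac{(26-15)(26-25)(26-35)}{(30-15)(30-25)(30-35)}
   + 48 \times \frac{(26-15)(26-25)(26-30)}{(35-15)(35-25)(35-30)} \\
  &= 36 \times \frac{1 \times -4 \times -9}{-10 \times -15 \times -20}
   + 40 \times \frac{11 \times -4 \times -9}{10 \times -5 \times -10}
   + 45 \times \frac{11 \times 1 \times -9}{15 \times 5 \times -5}
   + 48 \times \frac{11 \times 1 \times -4}{20 \times 10 \times 5} \\
  &= -\frac{54}{125} + \frac{792}{25} + \frac{297}{25} - \frac{264}{125} \\
  &= \frac{5127}{125} = 41.016
\end{aligned}$$

Therefore, income not exceeding Rs. 26 is being earned by
about 41 workers, as interpolated.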


The above example relates to a case of frequency distribution
where the magnitude of different class-intervals is not
equal. If, instead of a frequency distribution, individual items
were given, for instance, population of India during certain
years, the years advancing not necessarily by equal intervals,
the method of attacking the problem would remain the same
as that in the above example. In a like manner extrapolation
can be made.
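Lagrange's formula lends itself to direct computation, since each term is simply a recorded y multiplied by a ratio of products. The following Python sketch is an illustration only, not part of the original text; the function name lagrange_interpolate is introduced here, exact fractions are used so that the terms can be compared with the hand calculation, and the data are those of table 66.

```python
from fractions import Fraction


def lagrange_interpolate(xs, ys, x):
    """Value at x of Lagrange's formula through the points (xs[i], ys[i]).
    The xs need not advance by equal intervals, but must all differ."""
    total = Fraction(0)
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = Fraction(yi)
        for j, xj in enumerate(xs):
            if j != i:
                # One factor (x - xj) / (xi - xj) for every other point.
                term *= Fraction(x - xj, xi - xj)
        total += term
    return total


incomes = [15, 25, 30, 35]   # income not exceeding Rs. (x)
workers = [36, 40, 45, 48]   # cumulative number of workers (y)

estimate = lagrange_interpolate(incomes, workers, 26)
print(estimate, float(estimate))  # 5127/125 41.016
```

Passing a value of x outside the recorded range, say 40, gives an extrapolated figure by the same formula, subject to the same assumptions.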


Forecasting. 

While discussing the graphic method of interpolation used in
connection with periodic figures it was pointed out that when
the cyclical character of a curve has been ascertained, it is
easy to locate a missing figure, and it was also hinted there
that such curves can prove useful for forecasting. Indeed
they can, for it has been found that economic events move in
a cycle. Periods of industrial boom or of agricultural depression
have been found to repeat themselves at an interval of
7 to 10 years. Therefore, when once it has been found, as a
result of the study of sufficient and reliable data, that a certain
phenomenon is characterised by cyclical tendency,
its future course can be fairly accurately predicted
upon the strength of past knowledge. This prediction
is nothing but forecasting. Many businessmen
can make forecasts about the future of their business without
actually drawing a curve or even without knowing the name
of periodicity. For instance, a chemist knows well that the
malaria season is the one in which sales of quinine would be
the largest during a year, and a bullion merchant always
expects a rise in the price of silver during the months from
March to June with the approach of the marriage season in India.
Thus, every business has its season. The practical businessman
knows the facts of ups and downs in business from his
experience; to him a periodicity curve or forecast based on
it is of little interest. But to a statistician or an economist
the knowledge of periodicity is of great assistance in predicting
the course of many economic events and taking advantage
of them.

But even businessmen in the western countries are making
use of what are called Economic Barometers. These barometers
are special compilations made for the purpose of
indicating tendencies of economic events. The construction
of Indices of Business Conditions has already been explained in
Chapter XII. It has been pointed out there that business in
general passes through well-defined minor and major changes,
which fact makes business forecasting possible. How this
forecasting is actually done has been discussed in Chapter XIII
while dealing with the indices of business conditions prepared
in England and the U.S.A. The Harvard Committee on Economic
Research publishes these indices in the form of charts.
These charts help business forecasting. Similarly, the Forecasting
Composite Line published by the Brookmire Economic Service
helps in forecasting stock and commodity prices in the U.S.A.

Again, in Chapter XV while dealing with logarithmic or
ratio scale charts it was pointed out that such charts, once
their fluctuating character has been studied, can be extended
to predict a future figure relating to the phenomenon they
illustrate. This is, again, done with a knowledge of the trend
of the curve. In figure 21, the dotted line shows the manner
of extending the curve beyond the last date up to which evidence
is available. This is a method of extrapolation which
leads to forecasting. From the extended portions of the
curves in figure 21 it can be predicted that the amount in 1943
would rise to Rs. 50.64 and Rs. 506.4 respectively in the two
cases.

Conclusion. 

After a discussion of the various methods of interpolation
and extrapolation the practical utility of these methods must
now be clear. In administration as well as in business, maintenance
of minute records of each item from date to date is
a matter of time, labour and money. It is impossible to take
a yearly census of population, for instance. But population
figures are, year after year, necessary for estimating the income
from existing taxes, for exploring the possibilities of
new sources and estimating the additional income. How useful
the methods discussed above can be for estimating the
population year after year can be very easily seen. Similarly
a businessman's sales records may be incomplete or he may
like to estimate the probable demand for his wares in the
coming year. Methods such as those discussed above would
provide him with a most likely estimate based on past experience.
So, these methods are of immense practical use.


EXERCISES 

(1) What is interpolation? Explain its necessity by taking 
a few examples. 

(2) What are the assumptions that are made in interpolating
missing figures in a series? How far are interpolated or
extrapolated figures to be relied upon?

(3) What is extrapolation? Show by taking a few examples 
how extrapolation leads to forecasting. 

(4) What are Economic Barometers? How far do they help 
in forecasting economic events? 

(5) Give a few examples of the use of Interpolation in Busi- 
ness Statistics. 

(M. Com., Luck., 1942). 

(6) Explain fully the process of interpolation by graphic
method.

(7) What different methods used for the interpolation of 
population statistics are known to you? Discuss their merits. 

(8) Explain the use of graphic method of interpolation 
when a given series is periodic in character. 





(9) Interpolate the population of India for 1901 by graphic
method using the following data:

Year      Population in millions

1881      253
1891      287
1901      ?
1911      315
1921      319


(10) The following are the annual premiums required by the 
Bharat Insurance Co., Ltd., Lahore, to secure Rs. 1,000 with-profits 
by making 20 payments in all. What would be the premium pay- 
able at age 26 next birthday? 


Age next birthday    20 payments
                     Rs.    As.
20                   36
25                   39     f 2
30                   42     13
35                   47     6


(B. Com., Agra, 1932). 


(11) How is the population of any country in the inter-censal
period estimated?

The following table gives the population of India during the 
last four censuses; — 


Year    Crores
1901    29
1911    31
1921    32
1931    35


Estimate the population of India in 1936.

(B. Com., Agra, 1933).





(12) The following are the annual premiums in a certain Life
Insurance Co. for a policy of Rs. 500/- payable at death
with an agreed bonus:

Age next birthday    25       30       35      40      45
Annual premium       24/10    27/11    31/9    36/6    42/5

Calculate the premium at age 36.

(M. Com., Luck., 1942).

(13) The following table gives the premiums at different
ages in a Life Assurance Co.

Age    Premium (in Rupees)
25     23
30     26
35     30
40     35
45     42
50     51


Find the premium at ages 28 and 55. 


(15) The following table gives the quantities of a certain
brand of tea demanded at the prices noted against each. Estimate
the probable demand when the price is Re. 1/14/- a lb.

Price of tea per lb. (in Rs.)           1/4     1/8     1/12    2/-     2/4
Quantity demanded (in thousand lbs.)    82.5    70.8    63.1    55.0    48.9

(M.A., Alld., 1942).











(16) The following table shows the annual rounded off value 
of production in a factory for the period 1925 — 1935. Estimate 
the missing value for 1930. 


1925    6,005        1931    5,347
1926    5,012        1932    5,516
1927    5,031        1933    5,733
1928    5,068        1934    6,004
1929    5,129        1935    6,335

(M.A., Cal., 1937).


(16) The following table gives the population of Lucknow 
at the time of last six censuses: — 


Year          1881        1891        1901        1911        1921        1931
Population    2,53,729    2,64,953    2,56,239    2,32,332    2,17,167    2,51,097

Estimate the population of Lucknow in 1941 by the graphic
method only.

(B. Com., Lucknow, 1938).

(17) Discuss the utility of interpolation and extrapolation to
a businessman. What are the different methods known to you
for interpolation?

Interpolate the figures for 1921 by the algebraic method of 
finite differences: — 


Year    Population of India
1901    294,261,056
1911    315,156,396
1921    ?
1931    351,523,045


Test the validity of your method if you know the actual census 
figures for 1921. 


(M. Com., Alld., 1943). 





(18) State Newton's formula for interpolation. Calculate 
the sale of silk in 1928 from the following data: — 


Year    Total sales (in Rs. 1,000)
1927    233
1929    391
1931    582
1933    799
1935    1035


(M.A., Cal., 1936). 

(19) The following table shows the value of an immediate
life annuity for every 100 paid:

Age in years    40     50     60     70
Annuity (£)     6.2    7.2    9.1    12.0

Interpolate for the ages 42 and 69.

(M.A., Cal., 1936).


(20) Explain the need for interpolation in statistical work,
and the assumptions made in using interpolated values. The
following table gives the consumption of a certain chemical in tons
per year in a factory. Find the missing value for 1927:


Year 1923 1924 1925 1926 1927 1928 1929 1930 

Tons 500 699 1098 1699 ? 3504 4711 6119 


(M.A., Cal., 1935). 


(21) Estimate the probable number of persons earning between
Rs. 40 and Rs. 50 from the following data:

Income in Rs.    Number of persons
Below Rs. 20     120
Rs. 20-40        145
Rs. 40-60        200
Rs. 60-80        250
Rs. 80-100       150








(22) Interpolate the missing figures in the following table of
rice cultivation:

Year    Acres in millions
1911    76.6
1912    78.7
1913    ?
1914    77.7
1915    78.7
1916    ?
1917    80.6
1918    77.6
1919    78.7


(B. Com., Agra, 1937). 

(23) The following table gives the census population of a
certain town in 1891, 1901, 1911, 1921 and 1931. Estimate the
population in 1925, making your method clear:

Years    Population
1891     98,754
1901     132,285
1911     168,076
1921     195,690
1931     246,050

(M.A., Cal., 1937).

(24) The following are the marks obtained by 492 candidates
in a certain examination:

Not more than 40 marks    212 candidates
Not more than 45 marks    296 candidates
Not more than 50 marks    368 candidates
Not more than 55 marks    429 candidates
Not more than 60 marks    460 candidates
Not more than 65 marks    481 candidates
Not more than 70 marks    490 candidates
Not more than 75 marks    492 candidates

Find out the number of candidates who secured more than
42 but not more than 45 marks. (M.A., Cal., 1935).
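
The algebraic method of finite differences named in exercise (17) can be sketched in a few lines of Python. This is only an illustrative sketch, not a method prescribed in these exercises: it assumes an equally spaced series with exactly one missing figure, and it assumes that the difference of order n (n being the number of known figures) is zero. The function name `fill_single_gap` is hypothetical; the census figures are those printed in exercise (17).

```python
from math import comb


def fill_single_gap(values):
    """Estimate the one missing entry (None) in an equally spaced series.

    With n known values, the n-th order difference is assumed to be zero:
        sum over k of (-1)**(n - k) * C(n, k) * y_k = 0
    and this equation is solved for the missing y.
    """
    if values.count(None) != 1:
        raise ValueError("exactly one value must be missing")
    n = len(values) - 1          # number of known values, i.e. order of the vanishing difference
    gap = values.index(None)     # position of the missing figure
    known_sum = 0
    for k, y in enumerate(values):
        if k != gap:
            known_sum += (-1) ** (n - k) * comb(n, k) * y
    gap_coeff = (-1) ** (n - gap) * comb(n, gap)
    return -known_sum / gap_coeff


# Figures for 1901, 1911, 1921 (missing) and 1931 from exercise (17)
series = [294_261_056, 315_156_396, None, 351_523_045]
estimate_1921 = fill_single_gap(series)
print(f"Interpolated population for 1921: {estimate_1921:,.0f}")
```

The estimate depends entirely on the assumption of a vanishing third difference; comparing it with the recorded census figure, as the exercise asks, is the only test of whether that assumption was reasonable.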





CHAPTER XX 


INTERPRETATION OF DATA 

I 

The Science of Statistics, as we have defined it in
chapter II, is concerned with the collection, analysis and
interpretation of quantitative data. So far, we have been dealing
with the various statistical methods employed in the collection
and analysis of numerical facts. It is proposed to deal here
with the interpretation of statistical data.

Interpretation. 

Interpretation stands for the technique of drawing out
inferences from an analytical study of the collected figures.

While discussing the methods of collection, presentation,
comparison, correlation, interpolation, etc., we have put down,
wherever necessary, in appropriate places the essential
precautions which must be kept in view in handling statistical
facts for analysis as well as in commenting on the results of
analysis. A repetition of all those cautions here is obviously
uncalled for. But it must be said that common sense is as
much a chief requisite and experience as great a teacher in
the delicate task of interpretation as they are in the collection and
analysis of quantitative data. If this fundamental principle
is ignored, fallacious inferences would be drawn, which would
recoil on the statistician and his science. The statistician, to
repeat what we have said in chapter II, is not an alchemist
expected to produce gold from any worthless material; he is
rather like a chemist capable of assaying the value the material
contains and of extracting nothing more than this value.
Freedom from bias and prejudice on the part of the statistician
is, no doubt, necessary in the collection and analysis of data; it
is all the more so in interpretation, because it is interpretation



more than, or rather than, the collection and analysis of data
with which the layman is concerned. The power which figures
carry with them is such that the layman can be as easily
impressed by them as he can be deceived. From the advertiser's
gallery, from the electioneering platform, from the
propagandist's forum, from the partisan press and from a hundred
other sources the man in the street is bombarded with
tendentious figures put forward to support some ex parte statement.
Sometimes such figures are reasonably and justifiably used to
form a basis for the arguments built upon them; more often
they give an exaggerated picture of the truth, which may be due
to ignorance or inadvertence, but has also been found to be
influenced by motive, by a deliberate wish to mislead. The
layman is not unaware of this fact. If he distrusts all
arguments based on figures, his attitude is like that of a reasonable
man, who has not the training to separate the
wheat from the chaff for himself, and is, therefore, inclined to suspect
everything. Statistical methods are most dangerous tools in
the hands of the inexpert. A taxi driven by one who does
not know the art of driving might fall into a ditch or collide
with other vehicles, resulting in serious injury to the
passers-by, the taxi and the driver. A taxi must therefore be
driven by one expert in driving. So must statistics be
handled by one who is expert in the subject. Most often
what happens is that, attracted by the power of impression
which statistics command, so many men are led to use them
without knowing their limitations, and feel jubilant if they
win their point. At other times, it happens that when
the data have been scientifically collected and analyzed they
fall into the hands of those who hardly know the subject, for
purposes of interpretation. These people, sometimes under the
impress of their preconceived notions, and at other times due
to their habit of criticising everything they come across,
indulge in hair-splitting as if they alone knew the art of
interpreting statistical data. It is, therefore, established that





if the task of interpretation is to be scientifically performed, 
it must be done by a true statistician who is above all preju- 
dice and has the daring to call a spade a spade. 

Preliminaries to Interpretation 

Before starting on interpretation the statistician should 
examine whether 

(1) the data are adequate to base his judgement upon;

(2) the data are homogeneous and comparable;

(3) the data are properly collected and are without
biased errors;

(4) the data have been scientifically analyzed, and all
disturbing factors considered.

After satisfying himself on these preliminary points he
should begin with the drawing out of reasonable inferences.
Most of the mistakes that are made in interpreting numerical
data arise from false generalisation, a few examples of which
may now be considered.

Mistakes due to False Generalisation. 

Let us suppose that an argument runs as follows:

The prices of agricultural commodities in India in 1943
have increased to five times the prices in 1931. India's
prosperity in 1943 has, consequently, increased by leaps and
bounds.

The argument as it stands looks sound. Let it be agreed
that the ratio of prices between the two years is correctly
stated. Now, 1931 was a year of depression, when agricultural
commodities were, indeed, selling at very low prices, and 1943
a year of war boom. Therefore, the two periods
are different, and allowance must be made for
this difference before a right comparison is possible.
Again, are the prices of agricultural commodities a measure





of India's prosperity? Let it be supposed that they are a
measure of the prosperity of agriculturists in India. But, is
the whole of India agricultural, or only a part of it?
Evidently, the whole of India is not. Then, is what is true of
a part necessarily true of the whole? The answer is in the
negative. Further, even supposing that the income of
agriculturists in 1943 is much more than what it was in 1931, does
that measure their prosperity? The agriculturists have to spend
also, and if in spending they pay six times more on the same
items in 1943 than what they did in 1931, do they retain
money with them or lose? They do the latter. And, what is
prosperity: the mere income, or the surplus income? Lastly,
what is the significance of 'leaps and bounds'? 'Leaps and
bounds' may be an impressive term, but it is meaningless to the
statistician unless he knows the bounds of 'leaps and bounds.'
These queries will suffice to show what false generalisation
may lead to, and in what direction the mind
of a statistician should work to arrive at a correct inference.
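
The comparison of income with expenditure can be put in figures. The following is a minimal sketch in Python, assuming purely for illustration that an agriculturist's money income rises fivefold between 1931 and 1943 while the cost of the same items rises sixfold. The function name `real_income_ratio` and both multiples are hypothetical and are not drawn from any actual price or income series.

```python
def real_income_ratio(income_multiple, cost_multiple):
    """Real income in the later year as a multiple of the base year.

    income_multiple: money income in the later year / money income in the base year
    cost_multiple:   cost of the same items in the later year / cost in the base year
    """
    if cost_multiple <= 0:
        raise ValueError("cost_multiple must be positive")
    return income_multiple / cost_multiple


# Hypothetical multiples echoing the argument in the text
ratio = real_income_ratio(income_multiple=5, cost_multiple=6)
print(f"Real income in 1943 as a multiple of 1931: {ratio:.3f}")
```

A ratio below 1 means that the purchasing power of the income has fallen although the money income has risen.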

Let us take another example. It may be argued that the
production of foodstuffs in a country in a certain year was
only .5 per cent less than similar production in the previous
year. There was, therefore, no real food shortage in the year
under reference. Again, the argument creates an impression.
But let us analyze it. Is the number of
mouths to be fed in the country in the particular
year the same as it was in the previous year, or
has it increased? If it has increased, the demand for
foodstuffs, other things being equal, is expected to increase.
Again, were the foodstuffs exported from the country in the
previous year? And, have they been exported in the year
under consideration? If they were not exported in the
previous year and have been exported in the particular year,
or if the exports in the particular year exceed the exports
in the previous year, the shortage of .5 per cent would increase
to a higher figure so far as actual consumption is concerned.





Further, were foodstuffs imported into the country in the
previous year? If yes, have they been imported in the
particular year under study? If not, there would be shortage for
purposes of consumption. And, if the imports in the year under
consideration are much below those in the previous year,
shortage for consumption would result. These points would
serve to make it clear that the task of interpretation is
not strewn with roses; it needs an analytical approach.
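
The line of questioning above amounts to comparing the food available per head in the two years, rather than production alone. A minimal sketch in Python follows, on the assumption that availability is production plus imports less exports, divided by population. The function name `per_capita_availability` and every figure in it are hypothetical, chosen only to illustrate the argument.

```python
def per_capita_availability(production, imports, exports, population):
    """Food available for consumption per head:
    (production + imports - exports) / population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return (production + imports - exports) / population


# Hypothetical figures (thousand tons; thousand persons), for illustration only
previous = per_capita_availability(production=1000, imports=50, exports=0, population=4000)
current = per_capita_availability(production=995, imports=10, exports=20, population=4100)

production_change = (995 - 1000) / 1000 * 100
availability_change = (current - previous) / previous * 100

print(f"Change in production:            {production_change:.1f} per cent")
print(f"Change in availability per head: {availability_change:.1f} per cent")
```

With these assumed figures the fall in availability per head is many times larger than the fall in production, which is the point of the questions about population, exports and imports.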

Another example may be found in the argument that
since the quantity and value of goods imported into a
certain country have been increasing, the country is prospering.
Supposing the figures of imports are correct, the question arises:
how much of the imported goods are being re-exported?
Suppose all are being retained within the country. Then, is
the consumption of goods made in the country increasing,
constant or decreasing? If it is increasing or constant, the
per capita consumption of foreign and native goods is
increasing, and if increased per capita consumption is a measure of
prosperity, the country is prospering. But per capita
consumption would increase, remain constant or decrease
according to the change that takes place in the population of the
country. If the country is being colonized and the increase
in the number of immigrants far outweighs the proportionate
increase in the imports, it is difficult to say that the prosperity
of the country is increasing. Again, if the consumption of
native goods, population remaining the same, decreases, the
imports may be just sufficient to recompense this decrease, and,
therefore, per capita consumption may not increase. There
would then be hardly any increase in prosperity. But if the
imports are in excess of the decrease in the consumption of
native goods, the chances are that per capita consumption, and
with it the prosperity, would increase. Again, let us look
at the problem from another angle. If, along with increasing
imports, the consumption of luxury goods made in the country
is also increasing, it is difficult to conclude that per capita
consumption is increasing. Luxury goods are consumed by the
classes and not by the masses, but the classes may form only
a small portion of the country. What is true of them is
not necessarily true of the whole country. If the increment
in their prosperity is nullified by the loss in the prosperity
of the other section, national prosperity would not increase.
These lines of thought would show the amount of care and
caution required in interpreting quantitative data. They also
show that arguments, though seemingly correct, may be highly
fallacious, if they are not properly sifted and analyzed.

Wrong Interpretation of Index Numbers.

We have already dealt with the limitations of averages
in chapter X, and indicated there the fallacious conclusions
to which a careless interpretation of averages might lead.
Also, with regard to the use of both the general price and
the cost of living indices we have given adequate caution in
chapter XII. We may here take an example of wrong
interpretation of index numbers of prices. If a general index
series shows a rising tendency in a country, it may be argued
that since the price level, as indicated by the index series, is
rising, there is inflation in the country. The argument here
runs from effect to cause. This manner of arguing things is
rather questionable and unsound. An effect may be the
result of a multitude of causes. To single out one cause
from this multitude, without valid reasons and corroboratory
evidence, is not justifiable. Moreover, an index number reveals
a change in the relative value of the standard of deferred
payments and of goods in general, the two sides of the
quantity theory equation. It is not to be inferred that a
change in the level of prices is necessarily due to causes
directly touching the value of the standard rather than to
causes touching the value of the goods, which are being
compared with the standard. The index number merely reveals
the change in relationship; the cause of the change is another





question. Therefore, it is not safe to say that if an index
series shows a rising tendency the cause of the rise in the price level
is the increase in the quantity of money pushed into circulation.
An index number simply shows the tendency.

Wrong Interpretation of Coefficient of Correlation.

The coefficient of correlation simply shows that two variables
are related to each other. The value obtained for the
coefficient in a certain case should be interpreted with great
caution. Suppose a negative correlation is found to exist
between area under jute and area under rice in Bengal. The
argument may run that the cultivation of jute in Bengal is
increasing at the expense of rice. This might imply that the
people of Bengal want more jute for manufacturing jute
cloth, gunny bags, sand bags, etc. than rice for direct
consumption. This implication may prove incorrect, if, on
further investigation, it is found that the increase in area under
jute is due to war emergency requiring jute manufactures, or
due to relatively higher prices of jute, or due to such climatic
changes in the province as favour the growth of jute more
than that of rice, and the cultivation of rice has gone down
at the same time either because of rice plots having gone
under a crop other than jute, or because of the relative fall
in the price of rice or of rice-growers having joined the armed
forces of the country. So, although the area under jute is
increasing while that under rice is decreasing, it does not
necessarily mean that jute is being grown at the expense of
rice. From the above line of argument it follows
that the interpretation of the value of the coefficient of
correlation needs not only caution, but also a thorough
knowledge of the facts that constitute other hypotheses.
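
How little the coefficient itself knows of causes may be seen from its computation. The following is a minimal sketch in Python of Karl Pearson's coefficient, applied to two short series of acreages. The function name `pearson_r` and the acreage figures are hypothetical and are not actual Bengal statistics.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Karl Pearson's coefficient of correlation for two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]          # deviations from the mean of x
    dy = [y - mean_y for y in ys]          # deviations from the mean of y
    sum_xy = sum(a * b for a, b in zip(dx, dy))
    sum_xx = sum(a * a for a in dx)
    sum_yy = sum(b * b for b in dy)
    return sum_xy / sqrt(sum_xx * sum_yy)


# Hypothetical acreages (lakh acres) for six successive years
jute_area = [20, 22, 25, 27, 30, 33]
rice_area = [210, 208, 205, 204, 200, 198]

r = pearson_r(jute_area, rice_area)
print(f"r = {r:.3f}")
```

The coefficient comes out strongly negative, yet nothing but the two columns of figures enters the calculation. Prices, war demand, climate and recruitment appear nowhere in it, so the figure alone cannot decide between the rival explanations.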

Wrong Interpretation of Coefficient of Association.

While dealing with association of attributes in chapter XVIII
we made reference to partial association. We pointed out





the reasons to which illusory association may be due, and the
fallacious conclusions to which such an association might lead.
One more example may be taken.

It is observed, at a general election to the provincial
legislatures, that a greater proportion of the Hindu
Mahasabha candidates, who spent more money than their
opponents, the Congress candidates, won their election than the
Congress candidates who spent less. It is argued that the
Hindu Mahasabha candidates won because of their having
spent more than the Congress candidates. That is, there is a
positive association between "spending more than the
opponent" and "winning." The argument would be sound only
if, on further investigation, it is found that these two
attributes are not influenced by a third attribute. If a third
attribute influences them, the coefficient of association would work
out to be a high figure even though "winning" and
"spending more than the opponent" are not related. For instance,
the policy, principles and programme of the Hindu
Mahasabha may have generally carried the day, and "spending
more" had nothing to do with its success.
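
The effect of a third attribute can be illustrated numerically. The sketch below, in Python, assumes that the coefficient of association is measured by Yule's Q = (ad - bc)/(ad + bc) for a two-by-two table; the function name `yule_q` and all the counts are hypothetical, invented only to show how an association can appear in the pooled figures while being absent within each sub-group.

```python
def yule_q(both, first_only, second_only, neither):
    """Yule's coefficient of association Q = (ad - bc) / (ad + bc) for a 2x2 table."""
    ad = both * neither
    bc = first_only * second_only
    if ad + bc == 0:
        raise ValueError("coefficient undefined for this table")
    return (ad - bc) / (ad + bc)


# Hypothetical counts of candidates, in the order:
# (spent more and won, spent more and lost, spent less and won, spent less and lost)
popular_programme = (72, 8, 18, 2)      # candidates whose programme carried the day
unpopular_programme = (2, 18, 8, 72)    # candidates whose programme did not

# Pooled table, ignoring the third attribute (popularity of the programme)
pooled = tuple(a + b for a, b in zip(popular_programme, unpopular_programme))

q_popular = yule_q(*popular_programme)
q_unpopular = yule_q(*unpopular_programme)
q_pooled = yule_q(*pooled)

print(f"Within popular programme:   Q = {q_popular:.2f}")
print(f"Within unpopular programme: Q = {q_unpopular:.2f}")
print(f"All candidates pooled:      Q = {q_pooled:.2f}")
```

Within each sub-group spending and winning are unassociated, but the pooled table shows a high positive coefficient, because in these assumed figures the candidates with the popular programme also happened to spend more.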

General Directions for Interpretation.¹

In all the above examples we have not doubted the 
accuracy of the data. We have rather supposed that the data 
are correct, properly collected and presented. But, even with 
correct data we have seen how wrong and fallacious conclu- 
sions might be drawn. It will be seen that in all these cases 
what appeared to be correct at first sight was not necessarily 
so when further investigations were made. Therefore, a 
statistical conclusion must be based on all possible investiga- 
tions. In other words, statistical results should not be con- 
sidered as the sole determinants of the value of given data. 
Statistical treatment affords only one method of judging a 

¹ Read in this connection "Limitations and Distrust of Statistics" in
chapter III.




phenomenon. It is not the only method available. Therefore,
conclusions based on statistical analysis mean nothing
more than what the figures imply. A statistician cannot be
dogmatic about his conclusions. He cannot, and should not,
assert that his figures tell that a certain result must be such
and such. It may be; it may not be. It will be, only when
it is confirmed by other methods of studying the same
phenomenon. This great limitation on the interpretation of
statistics should, therefore, be always kept in view. Again,
statistical laws are true only on an average, or in the long
run. They study the norm, and not the abnormality. Statistics
deals with the group and not the individual. These facts
must not be ignored while interpreting and using statistical
data.


EXERCISES 

(1) What kinds of mistakes are generally made in interpreting
statistical data? Give examples.

(B. Com., Alld., 1936). 

(2) What conclusions would you draw regarding the economic
activities of the people living in the U. S. S. R. (Russia) from
a study of the figures given in the following table?

(1928 = 100)

                              1929    1930    1931    1932    1933    1934    1935
Industrial Production         126     164     203     231     250     300     369
Output of Investment goods    131     185     240     279     307     382     481
Output of Consumers' goods    122     147     172     190     200     230     274
Net Imports                   92      111     116     74      37      24      25
Net Exports                   114     128     100     71      61      52      45


(B. Com., Alld., 1939). 

(3) What do you understand by interpretation of Statistics
and how is it to be done?





(4) How would you interpret the following table giving the
age-distribution per thousand of the population in 1911?

Age            Germany    France    England    U.S.A.    Japan    India
Under 10       234        171       209        222       244      276
10-20          203        166       190        198       198      192
20-30          164        158       173        187       154      178
30-40          139        148       152        146       138      142
40-50          105        127       115        106       101      99
50-60          76         104       80         72        77       61
60-70          51         77        51         43        57       36
70 and over    23         49        30         26        31       16

(5) Interpret the facts contained in the following table: — 


                  Population in 1931    Mean density     No. of females
                  in millions           per sq. mile     per 1000 males
India             352.8                 195              940
Bengal            50.1                  646              924
Madras            46.7                  328              1025
U. P.             48.4                  456              902
Bihar & Orissa    37.6                  454              1005
Bombay            21.9                  177              901
Ajmer             0.5                   207              892
Assam             8.6                   157              900


(6) The following is an extract from an office draft of an 
Annual Report of a large Public Library. Recapitulate the 
essential information in the form of a tabular statement and bring 
out impressively the comparison attempted in the Report. 

Reading habits among borrowers vary from year to year. Topical
events leave their impress on the number of borrowers and more
particularly on the damage inflicted on books borrowed.

Whilst in 1938 only 15,000 books were lent out, the stress
of events in 1939 attracted no fewer than 380,000 borrowers.
These latter indented as many as 25,500 and either lost or
damaged 800 books. In 1938, there were only 1,20,000 borrowers, 48,000
borrowing 4,000 books dealing with Section F (Illustrated News).

In Section E (Travel) 1,000 were lent out in 1938, but the
number increased to 2,000 in 1939, while the number of borrowers
increased from 2,000 to 10,000 and the losses diminished from
5 to 2, or by 60 per cent.





Section D is made up of Pamphlets. In 1938 issues
amounted to 2,000 books and borrowers 26,000. The statistics only one
year later were 3,000 books and 60,000 borrowers with 100 losses.
In Section D, 40 books were lost in 1938 while in Section F, 680
losses among 2,80,000 borrowers were recorded in 1939. It may
be noted that in this section the issues in 1939 were 14,000.

Section C (Biography) was in 1938 responsible for 1,000
books, 2,000 borrowers and 2 losses; but in 1939, although the
books had increased in number by exactly one half, the number
of borrowers remained exactly the same as before, and so also the
number of losses. In Section B (Science) there was no increase
over the 3,000 in 1938, but 15,000 borrowers in 1938 increased
by 3,000 in 1939. The losses were exactly double, 3 in 1938
and 6 in 1939. Curiously enough in Section A (Fiction) the
number of books and books lost, which were respectively 4,000 and
20 in 1938, were reduced in 1939 to exactly one half of these
numbers. The number of borrowers also diminished from 28,000 to
10,000 in 1939. The total number of books in 1938 was 170 of
which were recorded in Section F.

(M. Com., Luck., 1942). 

(7) What inference do you draw regarding Indian Business
Activity from the following figures of the "Capital" Index of
Indian Industrial Activity?

August       1939    110.1
September    1939    117.4
October      1939    113.0
November     1939    111.9
December     1939    121.3
January      1940    116.6
February     1940    116.9
March        1940    112.1
April        1940    117.2
May          1940    122.0
June         1940    115.8
July         1940    115.7


(8) What are statistical fallacies? Give examples mentioning 
the factors responsible for them. 

(9) Comment on the following conclusions: — 

(1) Population of Cawnpore doubled during the decade 
1931-41. Therefore, the birth-rate for the town has 
also doubled. 




(2) The export of gold from India is increasing. The 
people of India are, therefore, getting poorer. 

(3) The national income per head in India has now
increased to Rs. 100 from Rs. 30 in 1900. Therefore, the
people of India are now more happy.

(4) The income from stamp duties has been increasing in 
India. Therefore, the number of suits filed in courts 
is increasing. 

(10) How would the present World War influence: 

(a) Population Census of 1951.

(b) Marriage rate in India.

(c) International Comparison of Statistics. 

(d) Life Insurance. 

(11) It is observed that intelligent fathers have intelligent
sons, and intelligent grandfathers have intelligent grandsons.
Therefore intelligence is hereditary. Comment.


Months       Notes in Circulation    Bombay Wholesale
1941-42      (Crores of Rs.)         Price Index No.
April        249                     122
May          255                     123
June         260                     127
July         257                     140
August       258                     144
September    266                     145
October      274                     152
November     284                     162
December     304                     180
January      328                     184
February     349                     194
March        357                     197


Calculate the coefficient of correlation from the above data 
and fully discuss whether the coefficient indicates that the rise in 
wholesale prices at Bombay is due to inflation. 



APPENDIX I A. 


SPECIMEN OF A BLANK-FORM.


The following blank-form was used by the Central 
Bureau of Economic Intelligence, United Provinces, in an 
enquiry into family budgets of mill workers in the United 
Provinces. 


I. FOOD

Article.            Quantity (Md.)    Cost (Rs.)
Wheat
Wheat Flour
Gram
Gram Flour
Birra (Bejhar)
Rice
Barley
Maize
Juar
Bajra
Dal Urd
Dal Arhar
Dal Mung
Dal ( )
Ghee
Oil ( )
Milk
Sugar
Gur
Meat
Fish
Eggs
Potato
Other Vegetables
Salt
Spices
Sweetmeats
Fruits
Tea
Total


II. FUEL

Article.            Quantity (Md.)    Cost (Rs.)
Firewood
Coal
Dung-cakes
Total


III. LIGHT

Article.            Quantity.         Cost.
Kerosene oil
Oil
Matches
Total


IV. HOUSE RENT

Rent                Repairs           Total




V. CLOTHING AND FOOTWEAR.

Article.    No.    Cost (Rs. a. p.)    Life (Months)    Cost per month (Rs. a. p.)

(a) MEN

1. Dhoti
2. Pyjama
3. Shirt
4. Saluka
5. Waist coat
6. Coat
7. Underwear
8. Dhusa (Lohi)
9. Napkin
10. Rumal
11. Socks
12. Shoes
13. Chappals
14. Safa
15. Cap
16.
Total M.

(b) WOMEN

17. Sari
18. Pyjama
19. Lahanga
20. Shirt
21. Saluka
22. Urhni
23. Burka
24. Chappal
25. Stockings
26.
Total W.

(c) CHILDREN

27. Dhoti
28. Sari
29. Lahanga
30. Pyjama
31. Shirt
32. Saluka
33. Urhni
34. Shoes
35. Chappal
36. Cap
37.
Total C.





VI. HOUSEHOLD REQUISITES. 

receipts from home. 


Article.    No.    Cost (Rs. a. p.)    Life (Months)    Cost per month (Rs. a. p.)

1. Charpai
2. Re-netting
3. Dari
4. Kathri
5. Razai
6. Sheets
7. Blanket
8. Utensils
9. Tinning
10. Umbrella
11. Mattresses
12. Huqqa
13.
Total










VII. MISCELLANEOUS

Article.            Cost (Rs. a. p.)

1. Sweeper
2. Barber
3. Dhobi
4. Soap
5. Hair oil
6. Medicine
7. Education
8. Conveyance
9. Travel
10. Tobacco
11. Pan Supari
12. Intoxicants ( )
13. Recreation
14. Ceremonials
15. Remittances
16. Postage
17. Subscription
18. Newspaper
19. Litigation
20. Interest
21. Debt
22.
Total






APPENDIX I B 


SPECIMEN OF A QUESTIONNAIRE
FOR THE CONSUMPTION BUDGET OF AN ARTISAN'S
FAMILY

A. PRELIMINARY. 

1. Names of the village, pargana, tehsil and district.

2. Nearest post office, police station and railway station.

3. Name of the head of the family.

4. Religion and caste.

5. Number of members in the family residing with the
wage-earner.

Adults: men and women.

Children (under 12): boys and girls.

6. Number of dependents not living with the wage-earner.

Adults: men and women.

Children (under 12): boys and girls.

7. Number residing outside the village and contributing 
to family income. 

B. MONTHLY FAMILY INCOME. 

1. Normal monthly earnings of the head of the family.

2. Similar earnings of other members of the family in
the village.

3. Contribution to family income by those residing
abroad.

4. Other sources of income.

Kind 

Cash. 




C. EXPENDITURE ON FOOD. (MONTHLY)

Quantity and Cost of:

1. Rice.

2. Wheat flour.

3. Barley, Jowar, Gram etc. to be specified.

4. Pulses to be specified.

5. Oil ( )

6. Ghee.

7. Salt and spices.

8. Vegetables.

9. Fruits.

10. Meat and Fish.

11. Sugar and gur.

13. Tea.

14. Others.

D. EXPENDITURE ON FUEL AND LIGHT. (MONTHLY) 

Quantity and Cost of:

1. Coal, or cow-dung.

2. Charcoal.

3. Firewood.

4. Kerosene oil.

5. Castor oil.

6. Others.

E. EXPENDITURE ON RENT. (MONTHLY) 

1. Is the house own or taken on rent?

2. If taken on rent, the amount paid as rent.

3. If own,

(a) Cost of repairs paid to labourers and suppliers of
materials.

(b) Was family labour used? If yes, to what extent?

(c) Cost of white-washing.

(d) When was the house constructed; cost of construction;
life of the house?





F. EXPENDITURE ON CLOTHING AND FOOTWEAR. (MONTHLY)

For each article under this head answer 

1. number of articles purchased. 

2. the period they are estimated to last. 

3. total cost incurred when purchased.

4. estimated cost per month. 

For Men:

1. Dhoties.
2. Pyjamas.
3. Kurtas.
4. Shirts.
5. Pagri, turbans or caps.
6. Coats and waistcoats.
7. Sherwanis.
8. Mirzais, or bandis.
9. Dhusa or Lohi.
10. Angocha, or handkerchief.
11. Socks.
12. Shoes.

For Women:

1. Lahanga or Sari.
2. Phariya, Urhni.
3. Kurti.
4. Bodice.
5. Petti-coat.
6. Chadar or Burqa.

For Children:

1. Dhoties.
2. Saries.
3. Kurta.
4. Caps.
5. Shirt.
6. Bodice.
7. Pyjama or Lahanga.
8. Shoes.
9. Angocha.

G. EXPENDITURE ON HOUSE-HOLD REQUISITES. 

(MONTHLY) 

Under this head also answer for each article

1. the number purchased.
2. the period they are estimated to last.
3. the total cost incurred when purchased.
4. estimated cost per month.

For bedding purposes:

1. Charpais.
2. Bedding proper: Dari, chadar, gadda, lihaf, pillows, pillow cases,
   blankets, razai.
3. Carpets for floor.

For utensils:

1. Thalis, Parat.
2. Deg, Batua, Patili.
3. Karhai, Tava.
4. Chamcha, chimta.

5. Others. 

H. EXPENDITURE ON MISCELLANEOUS ITEMS.

(MONTHLY) 

1. Amount paid to barber.
2. Amount paid to washerman.
3. Amount paid to sweeper.
4. Amount paid to village purohit or mulla.
5. Amount paid for religious functions.
6. Amount paid for medical fees and medicine.
7. Amount paid for education.
8. Amount paid for travelling by rail and road.
9. Amount paid for conventional necessities: Pan Supari, Bhang, tobacco, etc.
10. Interest on debt.
11. Repayment of debt.
12. Payment to married daughter.
13. Payment to dependents not living in the village.
14. Expenditure on amusements.
15. Expenditure on litigation.
16. Any other expenditure, e.g. on jewellery or ornaments, letters,
    Tika-bindi etc.


ABSTRACT OF EXPENDITURE. 

Family Income. 

Family Expenditure : — 

Food. 

Fuel and light. 

Rent 

Clothing and footwear. 

Household requisites. 

Miscellaneous. 

Balance of Income over expenditure 

(+ or -) 



APPENDIX II. 


LIST OF IMPORTANT STATISTICAL PUBLICATIONS. 

(A) Indian 

I. Publications of the Department of Commercial Intelligence
and Statistics.

1. Indian Trade Journal (Weekly).
2. Accounts relating to the Sea-borne Trade and Navigation of British
   India (Monthly).
3. Monthly Statistics of Cotton Spinning and Weaving in Indian Mills.
4. Monthly Statistics of the Production of Certain Selected Industries
   of India.
5. Accounts relating to the Inland (Rail and River-borne) Trade of
   India (Monthly).
6. Monthly Statement of wholesale prices of certain selected articles
   at various centres in India.
7. Accounts relating to the Sea-borne Trade and Navigation of British
   India (Annual).
8. Statistical Abstract for British India (Annual).
9. Agricultural Statistics of India:
   Vol. I, British India (Annual).
   Vol. II, Indian States (Annual).
10. Estimates of Area and Yield of Principal Crops in India (Annual).
11. Indian Tea, Coal, Rubber and Coffee Statistics (published
    separately) (Annual).
12. Joint Stock Companies in British India and in some Indian States
    (Annual).
13. Statistical Tables relating to Banks in India (Annual).
14. Statistical Tables relating to the Co-operative Movements in India
    (Annual).
15. Review of the Trade of India (Annual).
16. Large Industrial Establishments in India (Biennial).
17. Live-stock Statistics, India (Quinquennial).
18. Quinquennial Report on the average yield per acre of principal
    crops in India.
19. Crop Forecasts of Rice, Wheat, Cotton, Linseed, Sesamum, Groundnut,
    Sugarcane, Castorseed (periodically). (Also published in the Indian
    Trade Journal.)
20. Crop Atlas of India.

II. Reports of Committees and Commissions. 

1. Datta's Report on the Rise of Prices in India (1912).
2. Report of the Economic Enquiry Committee (1925).
3. Report of the Royal Commission on Indian Agriculture (1928).
4. Report of the Taxation Inquiry Committee.
5. Industrial Commission Report.
6. Report of the Royal Commission of Indian Labour.
7. Banking Inquiry Committee Reports (Central and Provincial).
8. Reports of the Committees and Commissions on Indian Currency and
   Exchange.
9. Industrial Surveys in various districts of U. P.
10. Labour, Unemployment, and Textile Enquiry Committee Reports
    (Provincial).
11. Tariff Board Reports.
12. Report of Bowley-Robertson Committee.

III. Other Government Publications.

1. Gazette of India (Weekly).
2. Provincial Gazettes (Weekly).
3. Labour Gazette, Bombay (Monthly).
4. Central and Provincial Governments' Budgets (Annual).
5. Administration Reports of Provincial Governments (Annual).
6. Administration Report of Railways in India (Annual).
7. Report of the Controller of Currency (Annual).
8. Census Reports (for India, Provinces and Native States) (Decennial).
9. Working Class Family Budgets.
10. Monthly Survey of Business Conditions in India.
11. Guide to Current Official Statistics.
12. Indian Labour Gazette.

IV. Non-official Publications.

1. Sankhya (Journal of the Indian Statistical Institute, Calcutta).
2. Capital (Calcutta) (Weekly).
3. Indian Journal of Economics (Allahabad).
4. Commerce (Calcutta) (Weekly).
5. Indian Year-Book (Times of India, Bombay) (Annual).
6. Wealth of India, by Wadia and Joshi.
7. Wealth and Taxable Capacity, by Shah and Khambata.
8. India's National Income, by V. K. R. V. Rao.
9. Eastern Economist.

(B) Foreign

I. Publications of Great Britain.

1. Board of Trade Journal and Commercial Gazette (Weekly).
2. Ministry of Labour Gazette (Monthly).
3. Journal of the Royal Statistical Society, London (Quarterly).
4. Annual Statistical Abstract of the United Kingdom.
5. Guide to Current Official Statistics of the United Kingdom.
6. Census Reports.
7. Census of Production Reports.
8. Reports of Commission on National Debt and Taxation.
9. London and Cambridge Economic Survey.

II. Publications of the League of Nations, Geneva. 

1. International Statistical Year Book (Annual).
2. Memorandum on Currency and Central Banks.
3. Memorandum on Public Finance.
4. Methods of Compiling Cost of Living Index Numbers (1925).
5. Methods of Conducting Family Budget Enquiries (1928).
6. Year Book of Labour Statistics (Annual).
7. International Labour Review.

III. Some Standard Books on Statistics.

1. King, W. J.: Elements of Statistical Methods.
2. Boddington, A. L.: Statistics and their Application to Commerce.
3. Connor, L. R.: Statistics in Theory and Practice.
4. Jones, D. C.: A First Course in Statistics.
5. Secrist, H.: Introduction to Statistical Methods.
6. Bowley, A. L.: Elementary Manual of Statistics.
7. Bowley, A. L.: Elements of Statistics.
8. Zizek: Statistical Averages.
9. Yule, G. U.: Theory of Statistics.
10. Yule, G. U. and Kendall: An Introduction to the Theory of
    Statistics.
11. Mills, F. C.: Statistical Methods applied to Economics and
    Business.
12. Elderton, W. P.: Frequency Curves and Correlation.
13. Fisher, R. A.: Statistical Methods for Research Workers.
14. Davis and Nelson: Elements of Statistics.



APPENDIX III. 

MEASUREMENT OF THE NATIONAL INCOME OF INDIA


SUMMARY OF THE SCHEME RECOMMENDED BY
THE BOWLEY-ROBERTSON COMMITTEE.

Dr. A. L. Bowley and Mr. D. H. Robertson were invited
by the Government of India to consider, inter alia, the materials
available for estimating the national income and wealth of
India. They submitted their report entitled ‘A Scheme for
an Economic Census of India’ in 1934, wherein, stating that
these materials were very defective, they put forward certain
practical proposals for estimating the total national income of
India.

‘The national income,’ according to the committee, ‘is the
money measure of the aggregates of goods and services
accruing to the inhabitants of a country during a year, including
net increments to, or excluding net decrements from, their
individual or collective wealth.’

Two methods of calculation have been pointed out: the
first comprising an evaluation of the goods and services
accruing, and the second a summation of individual incomes.
The first is the census of products method and the second is
the census of incomes method. The first method is unlikely to
be ever applicable over even the whole of the industrial field
in India. Special caution in combining the results of the two
methods may be necessary.

The census of products method involves:

(i) evaluating the net output of agriculture, mining,
industry, and other productive enterprises at the
point of production. Precaution is necessary to
avoid double counting (e.g. counting both the
output of wheat and the labour of the cattle
employed in raising it);

(ii) adding the value added by transporting and
merchanting agencies in the country to home-produced
goods and to imports;

(iii) adding excises on home-produced goods and
customs duties on imports;

(iv) adding the value of imports (c.i.f.), including gold
and silver;

(v) deducting the value of exports (f.o.b.), including
gold and silver;

(vi) deducting the value of goods, home-produced or
imported, which are used for maintaining fixed
capital, or stocks of raw and finished goods,
intact;

(vii) adding the value of personal services of all kinds;

(viii) adding the yearly rental value of houses,
whether rented or occupied by the owners;

(ix) adding the increments in the holdings of balances
and securities abroad by individuals or by
government, or deducting the decrement in such
holdings; similarly, deducting the increment in
such holdings in the country by residents abroad,
or adding their decrement.
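The nine steps above amount to a single sum with signs attached. The sketch below strings them together in Python; the function name, the parameter names and all the figures are hypothetical and are chosen only to show which items are added and which are deducted.

```python
def income_by_census_of_products(
    net_output,                  # (i) net output at the point of production
    transport_and_trade_margin,  # (ii) value added by transporting and merchanting
    excise_and_customs,          # (iii) excises and customs duties
    imports_cif,                 # (iv) imports, including gold and silver
    exports_fob,                 # (v) exports, including gold and silver
    capital_maintenance,         # (vi) goods used to keep capital and stocks intact
    personal_services,           # (vii) personal services of all kinds
    house_rental_value,          # (viii) yearly rental value of houses
    rise_in_holdings_abroad,     # (ix) increment held abroad by residents (negative if a decrement)
    rise_in_foreign_holdings,    # (ix) increment held in the country by residents abroad
):
    """Combine steps (i) to (ix); every argument is a money value for one year."""
    return (
        net_output
        + transport_and_trade_margin
        + excise_and_customs
        + imports_cif
        - exports_fob
        - capital_maintenance
        + personal_services
        + house_rental_value
        + rise_in_holdings_abroad
        - rise_in_foreign_holdings
    )


# Purely illustrative figures (say, in crores of rupees); not taken from the report.
example_total = income_by_census_of_products(
    1000, 150, 40, 120, 160, 90, 200, 60, 10, 4
)
print(example_total)
```

With these invented figures the total works out to 1326; changing any single item changes the total by exactly that amount, with the sign shown in the list.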

The method described above is the more fundamental of
the two methods of evaluating the National Income. Certain
precautions in following the census of incomes method must be
observed in order that the results arrived at through it may tally
with those obtained by the first method.

(i) All self-consumed produce and receipts in kind
must be included in the individual's income
valued at their selling price at the place of
production. So must be the yearly value of houses
occupied by the owners.

(ii) All interest payments must be deducted before
writing down individual's income.

(iii) The incomes of all individuals in the country
should be entered gross, i.e. before payment of
direct taxation. To this total should be added
the undistributed profits of companies and the
net profits of Government enterprises. From
this total should be subtracted the interest on
Government loans other than for productive
enterprises and the pensions of all ex-Government
servants.

(iv) To the total so far reached should be added
receipts from customs and excise, stamp duty and
local rates.
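The four precautions can be read as adjustments applied first to each individual's income and then to the grand total. The following Python sketch illustrates that reading; the class, the field names and the figures are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Individual:
    cash_income: float          # entered gross, before direct taxation (rule iii)
    income_in_kind: float       # self-consumed produce and receipts in kind (rule i)
    owner_occupied_rent: float  # yearly value of a house occupied by its owner (rule i)
    interest_paid: float        # interest payments, deducted under rule (ii)

    def income(self):
        return (self.cash_income + self.income_in_kind
                + self.owner_occupied_rent - self.interest_paid)


def income_by_census_of_incomes(individuals, undistributed_profits,
                                government_enterprise_profits,
                                interest_on_unproductive_loans,
                                government_pensions, indirect_tax_receipts):
    total = sum(person.income() for person in individuals)
    total += undistributed_profits + government_enterprise_profits        # rule (iii)
    total -= interest_on_unproductive_loans + government_pensions         # rule (iii)
    total += indirect_tax_receipts   # customs, excise, stamp duty, local rates: rule (iv)
    return total


# Invented figures for two individuals, for illustration only.
people = [Individual(100, 20, 5, 10), Individual(200, 0, 10, 0)]
example_income_total = income_by_census_of_incomes(people, 30, 15, 8, 12, 25)
print(example_income_total)
```

Here the two individual incomes come to 115 and 210, and the adjusted total to 375.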

The suggestions which Dr. Bowley and Mr. Robertson
make (stated below) relate to the estimate of the broad
sections of National Income: the various adjustments indicated
above would have their place in final calculations of the
National Income.

The investigation they propose for estimating the national
income is primarily on the basis of production, but a minor
part depends on individual incomes. The proportion to be
thus estimated is probably greater in the towns in India. They
say, “partly owing to difference in nature of the products
and partly because different methods of investigation are
necessary, rural income is distinguished from urban income.”

For rural income they recommend an estimate of the
quantity and value of all goods and services arising from the
land or rendered in the village, by the method of intensive
surveys in selected villages.

For urban income they suggest, in the first instance,
surveys of the larger towns based on a sample enquiry of the
personnel and occupations of families, an estimate of their
incomes by personal statements and by investigation of wages
and salaries prevailing in the town. For incomes over
Rs. 1,000 or at least over Rs. 2,000, income-tax statistics can
be of much help.



They have also recommended an intermediate Urban
Population Census.

These three enquiries would be supplemented by a
Census of Production applied to factories using power, mines
and some other industries.


RURAL SURVEYS.

The method advocated for selecting the villages for the
purpose of intensive survey is that of random sampling. It
consists of making a list of all the villages in a province in
geographical order of districts, and, after deciding the
number to be investigated, marking out, starting from some
random number, the required number of villages all nearly
equally spaced. Thus every unit in the aggregate will have
an equal chance of being taken in. When a village has been
once selected it should on no account be substituted by
another.
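The procedure described above (a list in geographical order, a random starting point, then selections nearly equally spaced) is what is now usually called systematic sampling with a random start. A minimal Python sketch follows; the function name and the stand-in village list are assumptions made for illustration.

```python
import random


def systematic_sample(units, sample_size, rng=None):
    """Pick `sample_size` nearly equally spaced units after a random start."""
    rng = rng or random.Random()
    total = len(units)
    if not 0 < sample_size <= total:
        raise ValueError("sample_size must lie between 1 and len(units)")
    interval = total / sample_size      # spacing between selected units
    start = rng.random() * interval     # random point inside the first interval
    positions = [min(int(start + k * interval), total - 1)
                 for k in range(sample_size)]
    return [units[p] for p in positions]


# Stand-in list: serial numbers of villages in geographical order of districts.
village_list = list(range(86_000))
chosen = systematic_sample(village_list, 250, random.Random(1))
print(len(chosen), chosen[:3])
```

Because the start is random and the spacing is fixed, each village has the same chance of selection, and no selected village is ever exchanged for another.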


The report gives the following table which shows the
number of villages to be investigated in each province:

Province              Number of Villages    Number in
                      in Province           Sample*

Bengal                 86,000               250
Bihar and Orissa       83,000               300
Bombay                 21,000               200
Central Provinces      40,000               200
Madras                 51,000               200
Punjab                 35,000               200
United Provinces      106,000               300

* [...] for British India [...] some estimates [...] N. W. F. Province,
tea plantations of Bengal, areas of Bengal where coalmining is
important, and the areas affected by earthquake and not completely
re-settled by the time of the enquiry.
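The table implies a different sampling interval for each province. The short Python sketch below works out the interval and the overall totals from the tabulated figures; the Bengal total of 86,000 is read from a badly printed entry and should be treated as an assumption, and the variable names are hypothetical.

```python
# (villages in province, villages in sample), as read from the table above.
provinces = {
    "Bengal": (86_000, 250),            # total is an assumed reading of a blurred figure
    "Bihar and Orissa": (83_000, 300),
    "Bombay": (21_000, 200),
    "Central Provinces": (40_000, 200),
    "Madras": (51_000, 200),
    "Punjab": (35_000, 200),
    "United Provinces": (106_000, 300),
}

# One village is taken out of every `interval` villages on the provincial list.
intervals = {name: total / sample for name, (total, sample) in provinces.items()}

total_villages = sum(total for total, _ in provinces.values())
total_sample = sum(sample for _, sample in provinces.values())

for name, interval in intervals.items():
    print(f"{name}: one village in every {interval:.0f}")
print(total_villages, total_sample)
```

On these figures the seven provinces contain 422,000 villages, of which 1,650 (about 0.4 per cent) would be surveyed.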



The investigator should be trained and live in each village
for 12 months. In many cases the villages could be grouped
in threes or fours, say within 30 miles of each other. To each of
such groups a superior investigator would be attached, who
would live in the largest village and do supervision work.
Each province should be under the charge of a qualified
statistician; and the entire survey should be controlled by the
Director of Statistics, whose appointment the committee
recommend as part of the Permanent Staff. The necessary
schedules should be prepared by the Director in consultation
with Provincial Statisticians. They should be adapted to
local conditions, and local terms of weights, measures etc.
should be used. The main enquiry would, no doubt, be
directed to income, production, consumption and allied topics,
yet the investigators would have ample time to report on
subjects like health, cooperation, debt etc.

URBAN SURVEYS. 

Random sample of towns is not recommended. The problem
is to be dealt with step by step, first by a synchronous
survey of those cities in which Universities can organise
satisfactory investigation, secondly by making similar, though
not so intensive, surveys of other towns. After the Rural
Surveys and the University City Surveys are completed,
efficient investigators should be engaged to survey selected towns.

University City Surveys.

In the organisation of these surveys central control
should be combined with local autonomy. A central committee
should be set up to draw up an outline schedule of
enquiry, to advise on any points referred to it and to present
a report on the whole subject in the end.

If the surveys fall to Government Colleges, cooperation
of Director of Public Instruction and the Education Department
would be necessary. If they fall to self-governing Universities,
arrangements would be made with the Economics
Department of the University concerned. City survey should
be directly carried on by one of the Economics staff. The
detailed investigations should be carried out by graduates or
postgraduates reading Economics.

There are two methods of approach: 1. Occupational 
2. By families. 

1. An occupational census is almost essential. In each
industry and important occupation in the town, enquiries
should be made regarding current rates of earnings and
wages, estimated over the year and allowing for seasonal
variations. It would include not only those employed in
constructive Industries, but also clerks, municipal and railway
employees, tonga-drivers and all others working for
salaries or wages or making small profits. The method of
payment (piece or time) may also be recorded.

2. An accurate list of houses or tenements is necessary.
Big towns, say of more than 150,000 persons, may better be
divided into wards or groups of wards, so that a unit may
consist of 30,000 houses. About 1000 houses may be selected
in each unit on a random basis, and visited by the investigator.
A house once selected should not be substituted by another.
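The house-selection step can be pictured as splitting the list of houses into units of about 30,000 and drawing about 1,000 from each without replacement. The Python sketch below is one possible reading; the constant names, the function name and the 75,000-house town are hypothetical, and the text does not say how a smaller final unit should be treated (here it also receives 1,000).

```python
import math
import random

HOUSES_PER_UNIT = 30_000   # size of a ward or group of wards, as suggested in the text
SAMPLE_PER_UNIT = 1_000    # houses to be visited in each unit


def sample_houses(house_list, rng=None):
    """Return one random sample of houses per unit, drawn without replacement."""
    rng = rng or random.Random()
    unit_count = math.ceil(len(house_list) / HOUSES_PER_UNIT)
    samples = []
    for u in range(unit_count):
        unit = house_list[u * HOUSES_PER_UNIT:(u + 1) * HOUSES_PER_UNIT]
        samples.append(rng.sample(unit, min(SAMPLE_PER_UNIT, len(unit))))
    return samples


# Stand-in for an accurate list of houses in a large town.
town_houses = list(range(75_000))
unit_samples = sample_houses(town_houses, random.Random(7))
print([len(s) for s in unit_samples])
```

Drawing without replacement guarantees that no house is listed twice, and the fixed list of selected houses is never altered afterwards, in keeping with the rule against substitution.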

The visitor should establish friendly relations with the
residents. Thereby he would be able to obtain reliable
information about numbers, sex, age and occupation of the family
group. Repeated visits may be necessary. Schedules should
be filled in immediately after and not during the visit.

The totals should, in case of doubt, be given as within a
certain range. All existing data relating to the subject of
the survey emanating from Central and Local authorities,
trade organisations etc. should be studied. Cooperation of these
official and non-official organisations should be sought.



CENSUS OF PRODUCTION. 

This census would be imposed by a special Act of the
Central Legislature, making the communication of facts
demanded compulsory. It would be conducted by the Director
of Statistics. Industries employing 20 or more persons and
using mechanical power, some small workshops, certain
industries where mechanical power is not used such as brick-making
and carpet manufacture, railways and all establishments
under the Mines Act would be covered.

Progress of factory industry is, to some extent, at the 
cost of cottage industry. Therefore, it would be good if the 
two could be brought in relation to each other. If some 
yearly data regarding them could be procured, an idea of 
their relative increase or decrease would be available. The 
necessary facts to be collected are the aggregate value of the 
sales and the aggregate cost of materials for each factory. 
The difference approximately shows the national income
accruing to the factory, and when all factories are considered,
the aggregate difference minus depreciation of plant and
change in value of materials and finished goods would be a
measure of the contribution to the national income of the
industry.
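The arithmetic of the census of production can be sketched as follows. The function names and the figures are hypothetical, and treating the change in value of materials and finished goods as a deduction follows the wording above rather than any fuller statement in the report.

```python
def factory_net_output(sales_value, materials_cost):
    """Aggregate value of sales less aggregate cost of materials for one factory."""
    return sales_value - materials_cost


def industry_contribution(factories, depreciation, stock_value_change):
    """Sum of factory differences, less depreciation and change in stock values."""
    aggregate_difference = sum(
        factory_net_output(sales, materials) for sales, materials in factories
    )
    return aggregate_difference - depreciation - stock_value_change


# Two imaginary factories: (sales, materials), in thousands of rupees.
returns = [(500, 300), (800, 450)]
contribution = industry_contribution(returns, depreciation=40, stock_value_change=10)
print(contribution)
```

With these invented returns the two factories show differences of 200 and 350, and the industry's contribution comes to 500.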

The classification of products should be the same as that
of exports and imports. The employees should be classified as
salaried persons and wage-earners, young and adult with a
statement of the age division between the two sexes. Besides,
details can be obtained of the amounts and values of different
commodities produced, and of materials bought, and of
power used.

The opposition, objections, and difficulties which the
investigator will encounter will be great, but with the periodic
repetition of the census they would automatically decrease.



APPENDIX IV. 


LOGARITHM. 

Logarithm of a given number to the base ten is the power 
to which the base ten should be raised to equal the given 
number. 


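$$
\begin{aligned}
10000 &= 10^{4}  &\quad\therefore\ \text{Logarithm of } 10000 &= 4\\
1000  &= 10^{3}  &\quad\therefore\ \text{Logarithm of } 1000  &= 3\\
100   &= 10^{2}  &\quad\therefore\ \text{Logarithm of } 100   &= 2\\
10    &= 10^{1}  &\quad\therefore\ \text{Logarithm of } 10    &= 1\\
1     &= 10^{0}  &\quad\therefore\ \text{Logarithm of } 1     &= 0\\
.1    &= 10^{-1} &\quad\therefore\ \text{Logarithm of } .1    &= -1\\
.01   &= 10^{-2} &\quad\therefore\ \text{Logarithm of } .01   &= -2\\
.001  &= 10^{-3} &\quad\therefore\ \text{Logarithm of } .001  &= -3\\
.0001 &= 10^{-4} &\quad\therefore\ \text{Logarithm of } .0001 &= -4
\end{aligned}
$$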


From the above it will be seen that the Log of 1 is 0 and
Log of 10 is 1. Therefore, Log of any number between 1 and
10 would be greater than zero but less than 1. That is, it
would be equal to 0 + a fraction. Similarly, Log of any
number between 10 and 100 is 1 + a fraction. Thus a
logarithm may consist of two parts: the whole number, and
the fraction. The whole number part, e.g., 0 or 1 in the above
instances, is termed characteristic and the fractional part is
termed Mantissa.
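The split into characteristic and mantissa can be checked on a machine. The Python sketch below uses the standard `math.log10` function in place of a printed table; the function name `split_log` is an assumption made for illustration.

```python
import math


def split_log(x):
    """Return (characteristic, mantissa) of the base-ten logarithm of x."""
    if x <= 0:
        raise ValueError("only positive numbers have logarithms")
    value = math.log10(x)
    characteristic = math.floor(value)   # whole-number part, may be negative
    mantissa = value - characteristic    # fractional part, never negative
    return characteristic, mantissa


for number in (4.539, 45.39, 0.04539):
    c, m = split_log(number)
    print(number, c, round(m, 4))
```

Taking the floor, rather than simply dropping the decimals, is what keeps the mantissa positive even when the number is less than one.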

To determine the characteristic of any given number,
the following two rules should be noted:

(1) If the given number is greater than one, the characteristic
is always positive or zero, and is obtained by the formula
$(n-1)$, where $n$ stands for the number of significant digits
before the decimal point.

(2) If the given number is less than one, the characteristic
is always negative, and is obtained by the formula $(N+1)$,
where $N$ stands for the number of zeroes after the decimal
point but before any significant digit. The minus sign of the
negative characteristic is written at the top of the characteristic
and not prefixed to it. Thus, the characteristic of
minus two is written as $\bar{2}$ and not as $-2$.

In accordance with the above two rules, characteristics of
a few numbers are given below:

Number       Value of n (rule 1)    Characteristic
4539         4                      3
453.9        3                      2
45.39        2                      1
4.539        1                      0

Number       Value of N (rule 2)    Characteristic
.4539        0                      $\bar{1}$
.04539       1                      $\bar{2}$
.004539      2                      $\bar{3}$
.0004539     3                      $\bar{4}$


Characteristic for any number can be similarly calculated. 
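The two counting rules can be applied mechanically to the written form of a number. The Python sketch below does so with the standard `decimal` module; the function name is hypothetical, and a negative characteristic is returned as an ordinary negative integer rather than in bar notation.

```python
from decimal import Decimal


def characteristic_by_rule(number_text):
    """Apply rule 1 (n - 1) or rule 2 (N + 1, taken as negative) to a written number."""
    value = Decimal(number_text)
    if value <= 0:
        raise ValueError("only positive numbers have logarithms")
    if value >= 1:
        digits_before_point = len(str(int(value)))      # n of rule 1
        return digits_before_point - 1
    fraction = format(value, "f").split(".")[1]         # digits after the point
    leading_zeroes = len(fraction) - len(fraction.lstrip("0"))   # N of rule 2
    return -(leading_zeroes + 1)


for text in ("4539", "453.9", "45.39", "4.539", ".4539", ".04539", ".004539", ".0004539"):
    print(text, characteristic_by_rule(text))
```

The printed values reproduce the two tables above: 3, 2, 1, 0 for the numbers greater than one, and minus 1 to minus 4 for those less than one.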

For calculating mantissa for different numbers, Logarithmic
table has to be consulted. Logarithmic table giving the
mantissa for any number having less than four
digits is given at the end of this appendix.
Mantissa of the required number should be read
out from this table irrespective of the position of the decimal.
If the given number is composed of four or more digits, it
must first be approximated to 3 digits. Then its mantissa
should be read out from the table. If we are to find the
Logarithm of 4539, we first approximate it to three digits.
The approximation is 454. The characteristic of 4539 is 3, and
the mantissa of 454 is .6571. Therefore, Logarithm of 4539
is 3.6571.

Two facts about mantissa should be clearly noted.
Firstly, it is always positive irrespective of the fact whether
the characteristic is negative or positive. Secondly, mantissa
is not affected by the position of the decimal point in the
number. The mantissa of 454, 45.4, 4.54, .454, .0454 or
45400000 is the same. Only the characteristic will differ.
Thus, to find the logarithms of the numbers whose characteristics
are given above, .6571 would be attached to the
characteristics already ascertained. The logarithm of 4.539,
for instance, would be 0.6571, and that of .004539 will be
$\bar{3}.6571$. Logarithm of any given number can be similarly
calculated. In adding up a series of logarithms in which some
characteristics are negative and some positive, the mantissa
of all the logs should be taken as positive, and characteristics
should be treated according to their algebraic signs.
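The table look-up and the rule for adding logarithms with mixed characteristics can be imitated as follows. This is only a sketch: `math.log10` stands in for the printed table, and the names `table_log`, `bar_form` and `add_logs` are hypothetical.

```python
import math


def table_log(x):
    """(characteristic, four-place mantissa) after approximating x to three digits."""
    characteristic = math.floor(math.log10(x))
    three_digits = round(x / 10 ** characteristic, 2)   # e.g. 4539 -> 4.54
    if three_digits >= 10:                              # e.g. 9.996 rounds up to 10.0
        characteristic, three_digits = characteristic + 1, 1.0
    return characteristic, round(math.log10(three_digits), 4)


def bar_form(characteristic, mantissa):
    """Write a logarithm with 'bar' marking a negative characteristic."""
    decimals = f"{mantissa:.4f}"[1:]                    # ".6571"
    if characteristic < 0:
        return f"bar{-characteristic}{decimals}"
    return f"{characteristic}{decimals}"


def add_logs(logs):
    """Add logs: mantissas as positive fractions, characteristics by their signs."""
    mantissa_sum = sum(m for _, m in logs)
    carry = math.floor(mantissa_sum)
    return sum(c for c, _ in logs) + carry, round(mantissa_sum - carry, 4)


print(bar_form(*table_log(4539)), bar_form(*table_log(0.004539)))
print(add_logs([table_log(4539), table_log(0.004539)]))
```

Adding 3.6571 and bar 3.6571 in this way gives mantissas totalling 1.3142 and characteristics totalling 0, so the sum is 1.3142.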

Antilog. 

Antilog of any given number is a required number
logarithm of which is the given number. An
antilog table is also given at the end of this
appendix. With its help antilog of any number can be
determined. The mantissa of the given number will give the
different digits of the required number and characteristic will
enable us to locate the decimal point in it. Suppose we want
to find the antilog of 2.1563. We approximate .1563 to three
digits and it comes to .156. The antilog of .156 is 1432. As
the characteristic is plus, according to rule 1, there must be
three significant digits before the decimal point. Therefore,
the required number is 143.2. Similarly, to find the antilog
of $\bar{2}.1563$, we ascertain the antilog of .156 which is 1432, as
in the former case. But since the characteristic is minus,
according to rule 2, there must be one zero after the decimal
point, so that the required number is .01432. Antilog of any
number can be similarly calculated.
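The two worked examples can be reproduced with a few lines of Python. The sketch computes the four-figure antilogarithm directly instead of reading it from the printed table; the function name `table_antilog` is an assumption, and a bar characteristic is passed in as a negative integer.

```python
def table_antilog(characteristic, mantissa):
    """Antilog from a three-digit mantissa and a signed characteristic."""
    three_place_mantissa = round(mantissa, 3)               # .1563 -> .156
    digits = round(10 ** three_place_mantissa, 3)           # 1.432, i.e. digits 1432
    return digits * 10 ** characteristic                    # place the decimal point


print(table_antilog(2, 0.1563))    # antilog of 2.1563
print(table_antilog(-2, 0.1563))   # antilog of bar 2.1563
```

The mantissa fixes the digits 1432 in both cases; only the characteristic decides whether the answer is about 143.2 or about .01432.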

Uses of Logarithms.

The logarithm of the product of any two numbers is the
sum of the logs of the two numbers taken separately and
therefore, when two or more numbers are to be multiplied the
logarithm of each number is found out and added. The antilog
of the sum is the required product.

$$\operatorname{Log}(a \times b) = \operatorname{Log} a + \operatorname{Log} b$$

$$\therefore\ a \times b = \operatorname{Antilog}\,[\operatorname{Log} a + \operatorname{Log} b]$$

The logarithm of any number a divided by b is the
difference of the logs of a and b. Therefore, when one
number is divided by another, the logs of both the numbers
are found out and the antilog of the difference between the
two logs gives the desired quotient.

$$\operatorname{Log}\frac{a}{b} = \operatorname{Log} a - \operatorname{Log} b$$

$$\therefore\ \frac{a}{b} = \operatorname{Antilog}\,[\operatorname{Log} a - \operatorname{Log} b]$$

The logarithm of any number raised to the nth power is
n times the log of that number. Therefore, when any given
number is raised to any power, the logarithm of the given
number is found and multiplied by the power to which the
number has been raised. The antilog of the product gives the
value of the given number raised to the desired power.

$$\operatorname{Log} a^{n} = n \operatorname{Log} a$$

$$\therefore\ a^{n} = \operatorname{Antilog}\,[n \operatorname{Log} a]$$

The log of a given number to the nth root is equal to the
log of that number divided by n. Therefore, when the value
of any given number to any given root is desired to be obtained,
the log of the given number is found out. This log is
divided by the given root to which the given number is to be
reduced. The antilog of the quotient gives the value of the
given number reduced to the desired root.

$$\operatorname{Log}\sqrt[n]{a} = \frac{\operatorname{Log} a}{n}$$

$$\therefore\ \sqrt[n]{a} = \operatorname{Antilog}\left[\frac{\operatorname{Log} a}{n}\right]$$
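The four uses of logarithms described in this section can be verified numerically. The Python sketch below works with full-precision logarithms from `math.log10`, so its answers are closer than four-figure tables would give; the function names are hypothetical.

```python
import math


def antilog(value):
    return 10 ** value


def multiply(a, b):
    return antilog(math.log10(a) + math.log10(b))   # Log(a x b) = Log a + Log b


def divide(a, b):
    return antilog(math.log10(a) - math.log10(b))   # Log(a / b) = Log a - Log b


def power(a, n):
    return antilog(n * math.log10(a))               # Log a^n = n Log a


def root(a, n):
    return antilog(math.log10(a) / n)               # Log of nth root = (Log a) / n


print(multiply(453.9, 0.04539), divide(453.9, 0.04539))
print(power(4.539, 3), root(4539, 3))
```

Each function replaces a multiplication, division, power or root by an addition, subtraction, multiplication or division of logarithms, followed by one antilog.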



MATHEMATICAL TABLES 


INSTRUCTIONS.


Table of Logarithms: This table gives the mantissa of
figures. To find the mantissa of any given number from the
table, the number should be approximated to three digits.
Mantissa of a number is the same regardless of the position
of the decimal point in it.

Table of Antilogarithms: This table gives the antilogarithms
of the mantissa portion of any given logarithm. The
position of the decimal point in the required figure should be
determined on the basis of the characteristic of the given
logarithm.

Table of Squares: In this table upto 316 one zero, and
from 317 onwards two zeroes, are omitted in each square. If
in the given figure the decimal point moves by one digit to the
left, then the decimal point moves by two digits to the
left in the square.

Table of Square Roots: This table gives two square roots
for each number. For odd digits in the given number, the
upper figure, and for even digits, the lower figure should be
taken. If in the given number the decimal point moves by
two digits to the left, then the decimal point moves by one
digit to the left in the square root.

Table of Reciprocals: If in the given number the decimal
point moves by one digit to the right, then the decimal point
moves by one digit to the left in the reciprocal.
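The decimal-point rules for squares, square roots and reciprocals can be confirmed with a short Python check. The number 45.4 and the variable names are chosen only for illustration.

```python
import math

x = 45.4

# Squares: one place to the left in the number, two places to the left in the square.
square_rule = math.isclose((x / 10) ** 2, x ** 2 / 100)

# Square roots: two places to the left in the number, one place in the root.
root_rule = math.isclose(math.sqrt(x / 100), math.sqrt(x) / 10)

# Reciprocals: one place to the right in the number, one place to the left in the reciprocal.
reciprocal_rule = math.isclose(1 / (x * 10), (1 / x) / 10)

# Two roots per entry: one serves an odd, the other an even count of digits before the point.
root_odd = math.sqrt(4.54)    # pattern for 4.54, 454, 45400, ...
root_even = math.sqrt(45.4)   # pattern for 45.4, 4540, ...

print(square_rule, root_rule, reciprocal_rule)
print(round(root_odd, 4), round(root_even, 4))
```

The last two lines show why the square-root table needs two figures for each entry: moving the decimal point by a single place changes the digits of the root, while moving it by two places only shifts them.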



LOGARITHMS

       0     1     2     3     4     5     6     7     8     9

10   .0000  0043  0086  0128  0170  0212  0253  0294  0334  0374
11   .0414  0453  0492  0531  0569  0607  0645  0682  0719  0755
12   .0792  0828  0864  0899  0934  0969  1004  1038  1072  1106
13   .1139  1173  1206  1239  1271  1303  1335  1367  1399  1430
14   .1461  1492  1523  1553  1584  1614  1644  1673  1703  1732
15   .1761  1790  1818  1847  1875  1903  1931  1959  1987  2014
16   .2041  2068  2095  2122  2148  2175  2201  2227  2253  2279
17   .2304  2330  2355  2380  2405  2430  2455  2480  2504  2529
18   .2553  2577  2601  2625  2648  2672  2695  2718  2742  2765
19   .2788  2810  2833  2856  2878  2900  2923  2945  2967  2989
20   .3010  3032  3054  3075  3096  3118  3139  3160  3181  3201
21   .3222  3243  3263  3284  3304  3324  3345  3365  3385  3404
22   .3424  3444  3464  3483  3502  3522  3541  3560  3579  3598
23   .3617  3636  3655  3674  3692  3711  3729  3747  3766  3784
24   .3802  3820  3838  3856  3874  3892  3909  3927  3945  3962
25   .3979  3997  4014  4031  4048  4065  4082  4099  4116  4133
26   .4150  4166  4183  4200  4216  4232  4249  4265  4281  4298
27   .4314  4330  4346  4362  4378  4393  4409  4425  4440  4456
28   .4472  4487  4502  4518  4533  4548  4564  4579  4594  4609
29   .4624  4639  4654  4669  4683  4698  4713  4728  4742  4757
30   .4771  4786  4800  4814  4829  4843  4857  4871  4886  4900
31   .4914  4928  4942  4955  4969  4983  4997  5011  5024  5038
32   .5051  5065  5079  5092  5105  5119  5132  5145  5159  5172
33   .5185  5198  5211  5224  5237  5250  5263  5276  5289  5302
34   .5315  5328  5340  5353  5366  5378  5391  5403  5416  5428
35   .5441  5453  5465  5478  5490  5502  5514  5527  5539  5551
36   .5563  5575  5587  5599  5611  5623  5635  5647  5658  5670
37   .5682  5694  5705  5717  5729  5740  5752  5763  5775  5786
38   .5798  5809  5821  5832  5843  5855  5866  5877  5888  5899
39   .5911  5922  5933  5944  5955  5966  5977  5988  5999  6010
40   .6021  6031  6042  6053  6064  6075  6085  6096  6107  6117
41   .6128  6138  6149  6160  6170  6180  6191  6201  6212  6222




LOOABITHMS 


489 



0 

1 

2 

3 

4 

5 

6 

7 

8 

9 

42 

•6232 

6243 

6253 

6263 

6274 

6284 

6294 

6304 6314 6325 

43 

•6335 

6345 

6355 

6365 

6375 

6385 

6395 

6405 

6415 

6425 

44 

•6435 

6444 

6454 

6464 

6474 

6484 

6493 

6503 

6513 

6522 

45 

•6532 

6542 

6551 

6561 

6571 

6580 

6590 

6599 

6609 

6618 

46 

•6628 

6637 

6646 

6656 

6665 

6675 

6684 

6693 

6702 

6712 

47 

•6721 

6730 

6739 

6749 

6758 

6767 

6776 

6785 

6794 

6803 

48 

•6812 

6821 

6830 

6839 

6848 

6857 

6»66 

6875 

6884 

6893 

49 

•6902 

6911 

6920 

6928 

6937 

6946 

6955 

6964 

6972 

6981 

50 

•6990 

6998 

7007 

7016 

7024 

7033 

7042 

7050 

7059 

7067 

51 

•7076 

7084 

7093 

7101 

7110 

7118 

7126 

7135 

7143 

7152 

52 

•7160 

7168 

7177 

7185 

7193 

7202 

7210 

7218 

7226 

7235 

53 

•7243 

7251 

7259 

7267 

7275 

7284 

7292 

7300 

7308 

7316 

54 

•7324 

7332 

7340 

7348 

7356 

7364 

7372 

7380 

7388 

7396 

55 

•7404 

7412 

7419 

7427 

7435 

7443 

7451 

7459 

7466 

7474 

56 

•7482 

7490 

7497 

7505 

7513 

7520 

7528 

7536 

7543 

7551 

57 

•7559 

7566 

7574 

7582 

7589 

7597 

7604 

7612 

7619 

7627 

58 

•7634 

7642 

7649 

7657 

"664 

7672 

7679 

7686 

7694 

7701 

59 

•7709 

7716 

7723 

7731 

7738 

7745 

7752 

7760 

7767 

7774 

60 

•7782 

7789 

7796 

7803 

7810 

7818 

7825 

7832 

7839 

7846 

61 

•7853 

7860 

7868 

7875 

7882 

7889 

7896 

7903 

7910 

7917 

62 

•7924 

7931 

7938 

7945 

7952 

7959 

7966 

7973 

7980 

7987 

63 

•7993 

8000 - 

8007 

8014 

8021 

8028 

8035 

8041 

8048 

8055 

64 

•8062 

8069 

8075 

8082 

8089 

8096 

8102 

8109 

8116 

8122 

65 

•8129 

8136 

8142 

8149 

8156 

8162 

8169 

8176 

8182 

8189 

66 

•8195 

8202 

8209 

8215 

8222 

8228 

8235 

8241 

8248 

8254 

67 

•8261 

8267 

8274 

8280 

8287 

8293 

8299 

8306 

8312 

8319 

68 

•8325 

8331 

8338 

8344 

8351 

8357 

8363 

8370 

8376 

8382 

69 

•8388 

8395 

8401 

8407 

8414 

8420 

8426 

8432 

8439 

8445 

70 

•8451 

8457 

8463 

8470 

8476 

8482 

8488 

8494 

8500 

8506 

71 

•8513 

8519 

8525 

8531 

8537 

8543 

8549 

8555 

8561 

8567 

72 

•8573 

8579 

8585 

8591 

8597 

8603 

8609 

8615 

8621 

8627 

73 

•8633 

8639 

8645 

8651 

8657 

8663 

8669 

8675 

8681 

8686

74 

•8692 

8698 

8704 

8710 

8716

8722

8727

8733 

8739 

8745 




490 


LOGARITHMS


0 1 2 3 4 5 6 7 8 9

75 •8751 8756 8762 8768 8774 8779 8785 8791 8797 8802
76 •8808 8814 8820 8825 8831 8837 8842 8848 8854 8859
77 •8865 8871 8876 8882 8887 8893 8899 8904 8910 8915
78 •8921 8927 8932 8938 8943 8949 8954 8960 8965 8971
79 •8976 8982 8987 8993 8998 9004 9009 9015 9020 9025
80 •9031 9036 9042 9047 9053 9058 9063 9069 9074 9079
81 •9085 9090 9096 9101 9106 9112 9117 9122 9128 9133
82 •9138 9143 9149 9154 9159 9165 9170 9175 9180 9186
83 •9191 9196 9201 9206 9212 9217 9222 9227 9232 9238
84 •9243 9248 9253 9258 9263 9269 9274 9279 9284 9289
85 •9294 9299 9304 9309 9315 9320 9325 9330 9335 9340
86 •9345 9350 9355 9360 9365 9370 9375 9380 9385 9390
87 •9395 9400 9405 9410 9415 9420 9425 9430 9435 9440
88 •9445 9450 9455 9460 9465 9469 9474 9479 9484 9489
89 •9494 9499 9504 9509 9513 9518 9523 9528 9533 9538
90 •9542 9547 9552 9557 9562 9566 9571 9576 9581 9586
91 •9590 9595 9600 9605 9609 9614 9619 9624 9628 9633
92 •9638 9643 9647 9652 9657 9661 9666 9671 9675 9680
93 •9685 9689 9694 9699 9703 9708 9713 9717 9722 9727
94 •9731 9736 9741 9745 9750 9754 9759 9763 9768 9773
95 •9777 9782 9786 9791 9795 9800 9805 9809 9814 9818
96 •9823 9827 9832 9836 9841 9845 9850 9854 9859 9863
97 •9868 9872 9877 9881 9886 9890 9894 9899 9903 9908
98 •9912 9917 9921 9926 9930 9934 9939 9943 9948 9952
99 •9956 9961 9965 9969 9974 9978 9983 9987 9991 9996
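Each entry in the table of logarithms is the four-figure mantissa of the common logarithm of a three-figure number. As a minimal sketch only, the entries can be regenerated as below. The function names `mantissa4` and `log_row` are illustrative and do not come from the text, and ordinary rounding to four places is assumed.

```python
import math


def mantissa4(n: int) -> int:
    """Four-figure mantissa of log10 for a three-figure argument n (100..999).

    The argument is read as n/100, so 750 stands for 7.50.
    """
    if not 100 <= n <= 999:
        raise ValueError("n must be a three-figure integer")
    return round(math.log10(n / 100) * 10_000)


def log_row(row: int) -> list[int]:
    """The ten entries (columns 0..9) printed against a row label 10..99."""
    return [mantissa4(row * 10 + col) for col in range(10)]


if __name__ == "__main__":
    # Row 75 of the table: mantissae of log 7.50 ... log 7.59
    print(75, log_row(75))
```

The characteristic (the whole-number part of the logarithm) is not given by the table and has to be supplied from the position of the decimal point in the original number.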



ANTI-LOGARITHMS


491 



0 

1 

2 

3 

4 

5 

6 

7 

8 

9 

•00

1000 

1002 

1005 

1007 

1009 

1012 

1014 

1016 

1019 

1021 

•01 

1023 

1026 

1028 

1030 

1033 

1035 

1038 

1040 

1042 

1045 

•02 

1047 

1050 

1052 

1054 

1057 

1059 

1062 

1064 

1067 

1069 

•03 

1072 

1074 

1076 

1079 

1081 

1084 

1086 

1089 

1091 

1094 

•04 

1096 

1099 

1102 

1104 

1107 

1109 

1112 

1114 

1117 

1119 

•05

1122 

1125 

1127 

1130 

1132 

1135 

1138 

1140 

1143 

1146 

•06

1148 

1151 

1153 

1156

1159 

1161 

1164 

1167 

1169 

1172 

•07 

1175 

1178 

1180 

1183 

1186 

1189 

1191 

1194 

1197 

1199 

•08 

1202 

1205 

1208 

1211 

1213 

1216 

1219

1222 

1225 

1227

•09 

1230 

1233 

1236 

1239 

1242 

1245

1247

1250 

1253 

1256 

•10

1259 

1262 

1265 

1268 

1271 

1274 

1276 

1279 

1282 

1285 

•11 

1288 

1291 

1294 

1297 

1300 

1303 

1306 

1309

1312 

1315 

•12

1318 

1321 

1324 

1327 

1330 

1334 

1337 

1340 

1343

1346 

•13

1349

1352

1355 

1358 

1361 

1365 

1368 

1371 

1374 

1377 

•14

1380 

1384 

1387 

1390 

1393 

1396 

1400 

1403 

1406 

1409 

•15

1413 

1416 

1419 

1422 

1426 

1429 

1432 

1435 

1439 

1442 

•16 

1445 

1449 

1452 

1455 

1459 

1462 

1466 

1469 

1472 

1476 

•17 

1479 

1483 

1486 

1489 

1493 

1496 

1500

1503 

1507 

1510 

•18 

1514 

1517 

1521

1524

1528

1531

1535

1538 

1542 

1545 

•19

1549

1552

1556

1560

1563

1567

1570

1574

1578 

1581 

•20 

1585 

1589 

1592 

1596

1600 

1603 

1607 

1611 

1614 

1618 

•21 

1622 

1626 

1629 

1633 

1637 

1641 

1644

1648 

1652 

1656 

•22 

1660 

1663 

1667

1671 

1675 

1679 

1683 

1687 

1690 

1694 

•23 

1698 

1702 

1706 

1710 

1714 

1718 

1722 

1726

1730 

1734 

•24 

1738 

1742 

1746 

1750 

1754

1758 

1762 

1766 

1770 

1774 

•25 

1778 

1782 

1786 

1791 

1795 

1799 

1803 

1807 

1811 

1816 

•26 

1820 

1824 

1828 

1832 

1837 

1841 

1845 

1849 

1854 

1858 

•27 

1862 

1866 

1871 

1875 

1879 

1884 

1888 

1892 

1897 

1901 

•28 

1905 

1910 

1914 

1919 

1923

1928 

1932 

1936 

1941 

1945 

•29 

1950 

1954 

1959 

1963 

1968 

1972 

1977 

1982 

1986 

1991 

•30 

1995 

2000 

2004 

2009

2014

2018

2023 

2028 

2032 

2037 

•31 

2042 

2046 

2051 

2056 

2061

2065 

2070 

2075

2080 

2084 

•32 

2089 

2094 

2099 

2104 

2109 

2113

2118 

2123 

2128 

2133 

•33 

2138

2143 

2148 

2153

2158 

2163 

2168 

2173 

2178 

2183



492 


ANTI-LOGARITHMS



0 

1 

2 

3 

4 

5 

6 

7 

8 

9 

•34 

2188 

2193 

2198 

2203 

2208 

2213 

2218 

2223 

2228 

2234 

•35 

2239 

2244 

2249 

2254

2259 

2265 

2270 

2275 

2280 

2286 

•36 

2291 

2296 

2301 

2307 

2312 

2317 

2323 

2328 

2333 

2339 

•37 

2344 

2350 

2355 

2360 

2366 

2371 

2377 

2382 

2388 

2393 

•38 

2399 

2404 

2410 

2415 

2421 

2427 

2432 

2438 

2443 

2449 

•39 

2455 

2460 

2466 

2472 

2477 

2483 

2489 

2495

2500 

2506 

•40 

2512 

2518 

2523 

2529 

2535 

2541 

2547 

2553 

2559 

2564 

•41 

2570 

2576 

2582 

2588 

2594 

2600 

2606 

2612 

2618 

2624 

•42 

2630 

2636 

2642 

2649 

2655 

2661 

2667 

2673 

2679 

2685

•43 

2692 

2698 

2704 

2710 

2716 

2723 

2729 

2735 

2742 

2748 

•44 

2754 

2761 

2767 

2773 

2780 

2786 

2793 

2799 

2805 

2812 

•45

2818 

2825 

2831 

2838 

2844 

2851 

2858 

2864 

2871 

2877 

•46 

2884 

2891 

2897 

2904 

2911 

2917 

2924 

2931 

2938 

2944 

•47 

2951 

2958 

2965 

2972 

2979

2985 

2992 

2999 

3006 

3013 

•48 

3020 

3027 

3034 

3041 

3048 

3055 

3062 

3069 

3076 

3083

•49 

3090 

3097 

3105 

3112 

3119 

3126 

3133 

3141 

3148 

3155 

•50 

3162 

3170 

3177 

3184 

3192 

3199 

3206 

3214 

3221 

3228 

•51 

3236 

3243 

3251 

3258 

3266 

3273 

3281 

3289 

3296 

3304 

•52 

3311 

3319 

3327 

3334 

3342 

3350 

3357 

3365 

3373 

3381 

•53 

3388 

3396 

3404 

3412 

3420 

3428 

3436 

3443 

3451 

3459 

•54 

3467 

3475 

3483 

3491 

3499 

3508 

3516 

3524 

3532 

3540 

•55 

3548 

3556 

3565 

3573 

3581 

3589 

3597 

3606 

3614 

3622 

•56 

3631 

3639 

3648 

3656 

3664 

3673 

3681 

3690 

3698 

3707 

•57 

3715 

3724 

3733 

3741 

3750 

3758 

3767 

3776 

3784 

3793 

•58 

3802 

3811 

3819 

3828 

3837 

3846 

3855 

3864 

3873 

3882 

•59 

3890 

3899 

3908 

3917 

3926 

3936 

3945 

3954 

3963 

3972 

•60 

3981

3990 

3999 

4009 

4018 

4027 

4036 

4046 

4055 

4064 

•61 

4074 

4083 

4093 

4102 

4111 

4121 

4130 

4140 

4150 

4159 

•62 

4169 

4178 

4188 

4198 

4207 

4217 

4227 

4236 

4246 

4256 

•63 

4266 

4276 

4285 

4295 

4305 

4315 

4325 

4335 

4345 

4355 

•64 

4365 

4375 

4385

4395 

4406 

4416 

4426 

4436 

4446 

4457 

•65 

4467 

4477 

4487 

4498 

4508 

4519

4529 

4539 

4550 

4560 

•66 

4571

4581 

4592 

4603 

4613 

4624 

4634 

4645 

4656 

4667 



ANTI-LOGARITHMS 493


0 1 2 3 4 5 6 7 8 9

•67 4677 4688 4699 4710 4721 4732 4742 4753 4764 4775 

•68 4786 4797 4808 4819 4831 4842 4853 4864 4875 4887 

•69 4898 4909 4920 4932 4943 4955 4966 4977 4989 5000 

•70 5012 5023 5035 5047 5058 5070 5082 5093 5105 5117

•71 5129 5140 5152 5164 5176 5188 5200 5212 5224 5236

•72 5248 5260 5272 5284 5297 5309 5321 5333 5346 5358

•73 5370 5383 5395 5408 5420 5433 5445 5458 5470 5483

•74 5495 5508 5521 5534 5546 5559 5572 5585 5598 5610 

•75 5623 5636 5649 5662 5675 5689 5702 5715 5728 5741

•76 5754 5768 5781 5794 5808 5821 5834 5848 5861 5875

•77 5888 5902 5916 5929 5943 5957 5970 5984 5998 6012 

•78 6026 6039 6053 6067 6081 6095 6109 6124 6138 6152

•79 6166 6180 6194 6209 6223 6237 6252 6266 6281 6295 

•80 6310 6324 6339 6353 6368 6383 6397 6412 6427 6442 

•81 6457 6471 6486 6501 6516 6531 6546 6561 6577 6592 

•82 6607 6622 6637 6653 6668 6683 6699 6714 6730 6745 

•83 6761 6776 6792 6808 6823 6839 6855 6871 6887 6902 

•84 6918 6934 6950 6966 6982 6998 7015 7031 7047 7063 

•85 7079 7096 7112 7129 7145 7161 7178 7194 7211 7228

•86 7244 7261 7278 7295 7311 7328 7345 7362 7379 7396 

•87 7413 7430 7447 7464 7482 7499 7516 7534 7551 7568 

•88 7586 7603 7621 7638 7656 7674 7691 7709 7727 7745

•89 7762 7780 7798 7816 7834 7852 7870 7889 7907 7925 

•90 7943 7962 7980 7998 8017 8035 8054 8072 8091 8110 

•91 8128 8147 8166 8185 8204 8222 8241 8260 8279 8299 

•92 8318 8337 8356 8375 8395 8414 8433 8453 8472 8492 

•93 8511 8531 8551 8570 8590 8610 8630 8650 8670 8690

•94 8710 8730 8750 8770 8790 8810 8831 8851 8872 8892 

•95 8913 8933 8954 8974 8995 9016 9036 9057 9078 9099 

•96 9120 9141 9162 9183 9204 9226 9247 9268 9290 9311 

•97 9333 9354 9376 9397 9419 9441 9462 9484 9506 9528

•98 9550 9572 9594 9616 9638 9661 9683 9705 9727 9750 

•99 9772 9795 9817 9840 9863 9886 9908 9931 9954 9977 
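The anti-logarithm table runs the other way: against a three-figure mantissa it gives the first four significant figures of the corresponding number. The following is a minimal sketch under the assumption of ordinary rounding; the name `antilog4` is illustrative and is not taken from the text.

```python
def antilog4(m: int) -> int:
    """Four significant figures of the number whose mantissa is 0.m.

    m is a three-figure mantissa given as an integer 0..999,
    so 429 stands for the mantissa .429.
    """
    if not 0 <= m <= 999:
        raise ValueError("m must lie between 0 and 999")
    return round(10 ** (m / 1000) * 1000)


def antilog_row(row: int) -> list[int]:
    """The ten entries (columns 0..9) printed against a row label .00 to .99."""
    return [antilog4(row * 10 + col) for col in range(10)]


if __name__ == "__main__":
    # Row .50 of the table: anti-logarithms of .500 ... .509
    print(".50", antilog_row(50))
```

As with the logarithms, only the significant figures are returned; the decimal point is fixed afterwards from the characteristic.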



494 


SQUARES 



0 

1 

2 

3 

4 

5 

6 

7 

8 

9 

10 

1000 

1020 

1040 

1061 

1082 

1103 

1124 

1145 

1166 

1188 

11

1210 

1232 

1254 

1277 

1300 

1323 

1346 

1369 

1392 

1416 

12 

1440 

1464 

1488 

1513

1538 

1563 

1588 

1613 

1638 

1664 

13 

1690 

1716 

1742 

1769 

1796 

1823 

1850 

1877 

1904 

1932 

14 

1960 

1988 

2016 

2045 

2074

2103 

2132 

2161 

2190 

2220 

15 

2250 

2280 

2310 

2341 

2372 

2403 

2434 

2465 

2496

2528 

16 

2560 

2592 

2624 

2657 

2690 

2723 

2756 

2789 

2822 

2856 

17 

2890 

2924 

2958 

2993 

3028 

3063 

3098 

3133 

3168

3204 

18 

3240 

3276 

3312 

3349

3386 

3423 

3460 

3497 

3534 

3572 

19 

3610 

3648 

3686 

3725 

3764 

3803 

3842 

3881 

3920 

3960 

20 

4000 

4040 

4080 

4121 

4162 

4203 

4244 

4285 

4326 

4368 

21 

4410 

4452 

4494 

4537

4580 

4623 

4666 

4709 

4752 

4796 

22 

4840 

4884 

4928 

4973 

5018 

5063

5108

5153

5198 

5244 

23 

5290 

5336 

5382 

5429 

5476 

5523 

5570 

5617 

5664 

5712 

24 

5760 

5808 

5856 

5905 

5954 

6003 

6052 

6101 

6150 

6200 

25 

6250 

6300 

6350 

6401 

6452 

6503 

6554 

6605 

6656 

6708 

26 

6760 

6812 

6864 

6917 

6970 

7023 

7076 

7129 

7182 

7236 

27

7290 

7344 

7398 

7453 

7508

7563 

7618 

7673 

7728 

7784 

28 

7840 

7896 

7952 

8009 

8066 

8123 

8180 

8237 

8294 

8352 

29 

8410 

8468 

8526 

8585 

8644 

8703 

8762

8821 

8880 

8940 

30 

9000 

9060 

9120 

9181 

9242 

9303 

9364 

9425 

9486 

9548 

31 

9610 

9672 

9734 

9797 

9860 

9923 

9986 

1005 

1011 

1018 

32 

1024 

1030 

1037 

1043 

1050

1056

1063 

1069 

1076 

1082 

33

1089 

1096 

1102 

1109 

1116 

1122 

1129 

1136 

1142 

1149 

34 

1156 

1163 

1170 

1176 

1183 

1190 

1197 

1204 

1211 

1218 

35 

1225 

1232 

1239 

1246 

1253 

1260 

1267 

1274 

1282 

1289 

36 

1296 

1303 

1310 

1318 

1325 

1332 

1340 

1347 

1354 

1362 

37 

1369 

1376 

1384 

1391 

1399 

1406 

1414 

1421 

1429 

1436 

38 

1444 

1452 

1459 

1467 

1475 

1482 

1490 

1498 

1505 

1513 

39 

1521 

1529 

1537 

1544 

1552 

1560

1568 

1576 

1584 

1592 

40 

1600 

1608 

1616 

1624 

1632 

1640 

1648 

1656 

1665 

1673 


The position of the decimal point must be determined according to
the instructions given.





SQUARES 


495 



0 

1 

2 

3 

4 

5 

6 

7 

8 

9 

41 

1681

1689 

1697 

1706 

1714 

1722 

1731 

1739 

1747 

1756

42 

1764 

1772 

1781 

1789 

1798 

1806 

1815 

1823 

1832 

1840 

43 

1849 

1858 

1866 

1875 

1884 

1892 

1901 

1910 

1918 

1927 

44 

1936 

1945 

1954 

1962 

1971 

1980 

1989 

1998 

2007 

2016

45 

2025 

2034 

2043 

2052 

2061 

2070 

2079 

2088 

2098 

2107 

46 

2116 

2125 

2134 

2144 

2153 

2162 

2172 

2181 

2190 

2200 

47 

2209 

2218 

2228 

2237 

2247 

2256 

2266

2275 

2285 

2294 

48 

2304 

2314 

2323 

2333 

2343 

2352 

2362 

2372 

2381 

2391 

49 

2401 

2411 

2421 

2430

2440 

2450 

2460 

2470 

2480 

2490 

50 

2500 

2510 

2520 

2530 

2540 

2550 

2560 

2570 

2581 

2591 

51 

2601 

2611 

2621 

2632 

2642 

2652 

2663 

2673 

2683 

2694

52 

2704 

2714 

2725 

2735 

2746 

2756 

2767 

2777 

2788 

2798

53 

2809 

2820 

2830 

2841 

2852 

2862 

2873 

2884 

2894 

2905 

54 

2916 

2927 

2938 

2948 

2959 

2970 

2981 

2992 

3003 

3014

55 

3025 

3036 

3047 

3058 

3069 

3080 

3091 

3102 

3114 

3125 

56 

3136 

3147 

3158 

3170 

3181 

3192 

3204 

3215 

3226 

3238 

57 

3249 

3260 

3272 

3283 

3295 

3306 

3318 

3329 

3341 

3352 

58 

3364 

3376 

3387 

3399 

3411 

3422 

3434 

3446

3457

3469 

59 

3481 

3493 

3505 

3516 

3528

3540 

3552 

3564 

3576 

3588 

60 

3600 

3612 

3624 

3636 

3648 

3660 

3672 

3684 

3697 

3709 

61 

3721 

3733 

3745 

3758 

3770 

3782 

3795 

3807 

3819 

3832 

62 

3844 

3856 

3869 

3881 

3894 

3906 

3919 

3931 

3944 

3956 

63 

3969 

3982 

3994 

4007 

4020 

4032 

4045 

4058 

4070 

4083 

64 

4096 

4109 

4122 

4134 

4147 

4160 

4173 

4186 

4199 

4212 

65 

4225 

4238 

4251 

4264 

4277 

4290 

4303 

4316 

4330 

4343 

66 

4356 

4369 

4382 

4396 

4409 

4422 

4436 

4449 

4462 

4476 

67 

4489 

4502 

4516 

4529 

4543 

4556 

4570 

4583 

4597 

4610 

68 

4624 

4638 

4651 

4665 

4679 

4692 

4706 

4720 

4733 

4747 

69 

4761 

4775 

4789 

4802 

4816 

4830 

4844 

4858 

4872 

4886 

70 

4900 

4914 

4928 

4942 

4956 

4970 

4984 

4998 

5013 

5027 

71 

5041 

5055 

5069 

5084 

5098 

5112 

5127 

5141 

5155 

5170 

72

5184 

5198 

5213 

5227 

5242 

5256 

5271 

5285 

5300 

5314 


The position of the decimal point must be determined according to
the instructions given.




496 


SQUARES


0 1 2 3 4 5 6 7 8 9

73 5329 5344 5358 5373 5388 5402 5417 5432 5446 5461 

74 5476 5491 5506 5520 5535 5550 5565 5580 5595 5610 

75 5625 5640 5655 5670 5685 5700 5715 5730 5746 5761 

76 5776 5791 5806 5822 5837 5852 5868 5883 5898 5914 

77 5929 5944 5960 5975 5991 6006 6022 6037 6053 6068 

78 6084 6100 6115 6131 6147 6162 6178 6194 6209 6225 

79 6241 6257 6273 6288 6304 6320 6336 6352 6368 6384 

80 6400 6416 6432 6448 6464 6480 6496 6512 6529 6545 

81 6561 6577 6593 6610 6626 6642 6659 6675 6691 6708 

82 6724 6740 6757 6773 6790 6806 6823 6839 6856 6872

83 6889 6906 6922 6939 6956 6972 6989 7006 7022 7039 

84 7056 7073 7090 7106 7123 7140 7157 7174 7191 7208 

85 7225 7242 7259 7276 7293 7310 7327 7344 7362 7379 

86 7396 7413 7430 7448 7465 7482 7500 7517 7534 7552 

87 7569 7586 7604 7621 7639 7656 7674 7691 7709 7726 

88 7744 7762 7779 7797 7815 7832 7850 7868 7885 7903 

89 7921 7939 7957 7974 7992 8010 8028 8046 8064 8082

90 8100 8118 8136 8154 8172 8190 8208 8226 8245 8263 

91 8281 8299 8317 8336 8354 8372 8391 8409 8427 8446 

92 8464 8482 8501 8519 8538 8556 8575 8593 8612 8630 

93 8649 8668 8686 8705 8724 8742 8761 8780 8798 8817 

94 8836 8855 8874 8892 8911 8930 8949 8968 8987 9006 

95 9025 9044 9063 9082 9101 9120 9139 9158 9178 9197 

96 9216 9235 9254 9274 9293 9312 9332 9351 9370 9390 

97 9409 9428 9448 9467 9487 9506 9526 9545 9565 9584 

98 9604 9624 9643 9663 9683 9702 9722 9742 9761 9781 

99 9801 9821 9841 9860 9880 9900 9920 9940 9960 9980 


The position of the decimal point must be determined according to
the instructions given.
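The table of squares lists, for every number from 10.0 to 99.9, the first four significant figures of its square, leaving the decimal point to the reader. A minimal sketch of how an entry arises is given below. The name `square4` is illustrative, and the rule that halves are rounded upwards is an assumption inferred from printed entries such as 1103 against 10.5.

```python
def square4(n: int) -> int:
    """First four significant figures of (n/10)**2 for n = 100..999.

    n is the row label and column digit run together, so 144 stands
    for 14.4, whose square 207.36 appears in the table as 2074.
    """
    if not 100 <= n <= 999:
        raise ValueError("n must be a three-figure integer")
    sq = n * n  # exact integer: (n/10)**2 multiplied by 100
    if sq < 100_000:
        # Square is below 1000: keep one decimal place (five-figure sq -> four)
        return (sq + 5) // 10
    # Square is 1000 or more: keep whole numbers only (six-figure sq -> four)
    return (sq + 50) // 100


if __name__ == "__main__":
    for n in (105, 144, 316, 317, 999):
        print(n / 10, square4(n))
```

Integer arithmetic is used so that exact halves (for example 10.5 squared = 110.25) are rounded upwards rather than to the nearest even figure.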




SQUARE ROOTS 497



0 

1 

2 

3 

4 

5 

6 

7 

8 

9 

10 

1000 

3162 

1005 

3178 

1010 

3194 

1015 

3209 

1020 

3225 

1025 

3240 

1030 

3256 

1034 

3271 

1039 

3286 

1044 

3302 

11 

1049 

3317 

1054 

3332 

1058 

3347 

1063 

3362 

1068 

3376 

1072 

3391 

1077 

3406 

1082 

3421 

1086 

3435 

1091 

3450 

12 

1095 

3464 

1100 

3479 

1105 

3493 

1109 

3507 

1114 

3521 

1118 

3536 

1122 

3550 

1127 

3564 

1131 

3578 

1136 

3592 

13 

1140 

3606 

1145
3619 

1149 

3633 

1153 

3647 

1158 

3661 

1162 

3674 

1166 

3688 

1170 

3701 

1175
3715 

1179 

3728 

14 

1183 

3742 

1187 

3755 

1192 

3768 

1196 

3782 

1200 

3795 

1204 

3808 

1208 

3821 

1212 

3834 

1217 

3847 

1221
3860 

15 

1225 

3873 

1229 

3886 

1233 

3899 

1237 

3912 

1241 

3924 

1245

3937 

1249 

3950 

1253 

3962 

1257 

3975 

1261 

3987 

16 

1265 

4000 

1269 

4012 

1273 

4025 

1277 

4037 

1281 

4050 

1285 

4062 

1288 

4074 

1292 

4087 

1296 

4099 

1300 
4111

17 

1304 

4123 

1308 

4135 

1311 

4147 

1315 

4159 

1319 

4171 

1323 

4183 

1327 

4195 

1330 

4207 

1334 

4219 

1338
4231

18

1342 

4243 

1345 

4254 

1349 

4266 

1353 

4278 

1356

4290

1360

4301

1364

4313

1367

4324

1371

4336

1375 

4347 

19 

1378 

4359 

1382 

4370 

1386 

4382 

1389 

4393 

1393 

4405 

1396

4416 

1400 

4427 

1404 

4438 

1407 

4450 

1411 
4461

20 

1414 

4472 

1418 

4483 

1421 

4494 

1425 

4506 

1428 

4517 

1432 

4528 

1435 

4539 

1439 

4550 

1442 

4561 

1446 

4572 

21 

1449 

4583 

1453 

4593 

1456 

4604 

1459 

4615 

1463 

4626 

1466 

4637 

1470 

4648 

1473 

4658 

1476 

4669 

1480 

4680 

22 

1483 

4690 

1487 

4701 

1490 

4712 

1493 

4722 

1497 

4733 

1500 

4743 

1503 

4754 

1507 

4764 

1510 

4775 

1513 
4785

23 

1517 

4796 

1520 

4806 

1523 

4817 

1526 

4827 

1530 

4837 

1533 

4848 

1536 

4858 

1539 

4868 

1543 

4879 

1546 

4889 

24 

1549 

4899 

1552 

4909 

1556 

4919 

1559 

4930 

1562

4940

1565

4950

1568 

4960 

1572 

4970 

1575 

4980 

1578 

4990 


The first significant figure and the position of the decimal point must
be determined in accordance with the instructions given.




SQUARE ROOTS


498



0

1 

2 

3 

4 

5 

6 

7 

8 

9 

25 

1581 

1584 

1587 

1591 

1594 

1597 

1600 

1603 

1606 

1609 


5000 

5010 

5020 

5030 

5040 

5050 

5060 

5070 

5079 

5089 

26 

1612 

1616 

1619 

1622 

1625 

1628 

1631 

1634 

1637 

1640 


5099 

5109 

5119 

5128 

5138 

5148 

5158 

5167 

5177 

5187 

27 

1643 

1646 

1649 

1652 

1655 

1658 

1661 

1664 

1667 

1670 


5196 

5206 

5215 

5225 

5235 

5244 

5254 

5263 

5273 

5282 

28 

1673 

1676 

1679 

1682 

1685 

1688 

1691 

1694 

1697 

1700 


5292 

5301 

5310 

5320 

5329 

5339 

5348 

5357 

5367 

5376 

29 

1703 

1706 

1709 

1712 

1715 

1718 

1720 

1723 

1726 

1729 


5385 

5394 

5404 

5413 

5422 

5431 

5441 

5450 

5459 

5468 

30 

1732 

1735 

1738 

1741 

1744 

1746 

1749 

1752 

1755 

1758 


5477 

5486 

5495 

5505 

5514 

5523 

5532 

5541 

5550 

5559 

31 

1761 

1764 

1766 

1769 

1772 

1775 

1778 

1780 

1783 

1786


5568 

5577 

5586 

5595 

5604 

5612 

5621 

5630 

5639 

5648 

32 

1789 

1792 

1794 

1797 

1800 

1803 

1806 

1808 

1811 

1814 


5657 

5666 

5675 

5683 

5692 

5701 

5710 

5718 

5727 

5736 

33 

1817 

1819 

1822 

1825 

1828 

1830 

1833 

1836 

1838 

1841


5745 

5753 

5762 

5771 

5779 

5788 

5797 

5805 

5814 

5822 

34 

1844 

1847 

1849 

1852 

1855 

1857 

1860 

1863 

1865 

1868 


5831 

5840 

5848 

5857 

5865 

5874 

5882 

5891 

5899 

5908 

35 

1871 

1873 

1876 

1879 

1881 

1884 

1887 

1889 

1892 

1895 


5916 

5925 

5933 

5941 

5950 

5958 

5967 

5975 

5983 

5992 

36 

1897 

1900 

1903 

1905 

1908 

1910 

1913 

1916 

1918 

1921 


6000 

6008 

6017 

6025 

6033 

6042 

6050 

6058 

6066 

6075 

37 

1924 

1926 

1929 

1931 

1934 

1936 

1939 

1942 

1944 

1947 


6083

6091 

6099 

6107 

6116 

6124 

6132 

6140 

6148 

6156 

38 

1949 

1952 

1954 

1957 

1960 

1962 

1965 

1967 

1970 

1972 


6164 

6173 

6181 

6189 

6197 

6205 

6213 

6221 

6229 

6237 

39 

1975 

1977 

1980 

1982 

1985 

1987 

1990 

1992 

1995 

1997 


6245 

6253 

6261 

6269 

6277 

6285 

6293 

6301 

6309 

6317 


The first significant figure and the position of the decimal point must
be determined in accordance with the instructions given.



SQUARE ROOTS


499 



0 

1 

2 

3 

4 

5 

6 

7 

8 

9 

40 

2000 

6325 

2002 

6332 

2005 

6340 

2007 

6348 

2010 

6356 

2012 

6364 

2015 

6372 

2017 

6380 

2020 

6387 

2022 

6395 

41 

2025 

6403

2027 

6411 

2030 

6419 

2032 

6427 

2035 

6434 

2037 

6442 

2040

6450

2042

6458

2045 

6465 

2047 

6473 

42 

2049 

6481 

2052 

6488 

2054 

6496 

2057 

6504 

2059

6512 

2062 

6519 

2064 

6527 

2066 

6535 

2069 

6542 

2071

6550 

43 

2074

6557 

2076 

6565 

2078 

6573 

2081 

6580 

2083 

6588 

2086 

6595 

2088 

6603 

2090 

6611 

2093 

6618 

2095 

6626 

44 

2098 

6633 

2100 

6641 

2102 

6648 

2105 

6656 

2107 

6663 

2110 

6671 

2112 

6678 

2114 

6686 

2117 

6693 

2119

6701 

45 

2121 

6708 

2124 

6716 

2126 

6723 

2128 

6731 

2131 

6738 

2133 

6745 

2135 

6753 

2138 

6760 

2140 

6768 

2142

6775 

46 

2145 

6782 

2147 

6790 

2149 

6797 

2152 

6804 

2154 

6812 

2156 

6819 

2159 

6826 

2161 

6834 

2163 

6841 

2166 

6848 

47

2168

6856

2170

6863

2173

6870

2175

6877

2177

6885

2179 

6892 

2182 

6899 

2184 

6907 

2186 

6914 

2189 

6921 

48 

2191 

6928 

2193 

6935 

2195 

6943 

2198 

6950 

2200 

6957 

2202 

6964 

2205 

6971 

2207 

6979 

2209 

6986 

2211 

6993 

49 

2214 

7000 

2216 

7007 

2218 

7014 

2220 

7021 

2223 

7029 

2225 

7036 

2227 

7043 

2229 

7050 

2232 

7057 

2234 

7064 

50 

2236 

7071 

2238 

7078 

2241 

7085 

2243 

7092 

2245 

7099 

2247 

7106 

2249 

7113 

2252 

7120 

2254 

7127 

2256 

7134 

51 

2258 

7141 

2261 

7148 

2263 

7155 

2265 

7162 

2267 

7169 

2269 

7176 

2272 

7183

2274 

7190 

2276 

7197 

2278 

7204 

52 

2280 

7211 

2283 

7218 

2285 

7225 

2287 

7232 

2289 

7239 

2291 

7246 

2293 

7253 

2296 

7259 

2298 

7266 

2300 

7273 

53 

2302 

7280 

2304 

7287 

2307 

7294 

2309 

7301 

2311 

7308

2313 

7314 

2315 

7321 

2317 

7328 

2319 

7335 

2322 

7342 

54 

2324 

7348 

2326 

7355 

2328 

7362 

2330 

7369 

2332 

7376 

2335

7382

2337

7389

2339 

7396 

2341 

7403 

2343 

7409 


The first significant figure and the position of the decimal point must
be determined in accordance with the instructions given.





500 SQUARE ROOTS



0 

1 

2 

3 

4 

5 

6 

7 

8 

9 

55

2345 

2347 

2349 

2352 

2354

2356

2358 

2360 

2362 

2364 


7416 

7423 

7430 

7436 

7443 

7450 

7457 

7463 

7470 

7477 

56 

2366 

2369 

2371 

2373 

2375 

2377 

2379 

2381 

2383 

2385 


7483 

7490 

7497 

7503 

7510 

7517 

7523 

7530 

7537 

7543 

57 

2387 

2390 

2392 

2394 

2396 

2398 

2400 

2402 

2404 

2406 


7550 

7556 

7563 

7570 

7576 

7583 

7589 

7596 

7603 

7609 

58 

2408 

2410 

2412 

2415 

2417 

2419 

2421 

2423 

2425 

2427 


7616 

7622 

7629 

7635 

7642 

7649 

7655 

7662 

7668 

7675 

59 

2429 

2431 

2433

2435 

2437 

2439 

2441 

2443 

2445 

2447 


7681 

7688 

7694 

7701 

7707 

7714 

7720 

7727 

7733 

7740 

60 

2449 

2452 

2454 

2456 

2458 

2460 

2462 

2464 

2466 

2468 


7746 

7752 

7759 

7765 

7772 

7778 

7785 

7791 

7797 

7804 

61 

2470 

2472 

2474 

2476 

2478 

2480 

2482 

2484 

2486 

2488


7810 

7817 

7823 

7829 

7836 

7842 

7849 

7855 

7861 

7868 

62 

2490 

2492

2494

2496 

2498 

2500 

2502 

2504 

2506 

2508 


7874 

7880 

7887 

7893 

7899 

7906 

7912 

7918 

7925 

7931 

63 

2510 

2512 

2514 

2516 

2518 

2520 

2522 

2524 

2526 

2528 


7937 

7944 

7950 

7956 

7962 

7969 

7975 

7981 

7987 

7994 

64 

2530 

2532

2534 

2536 

2538 

2540 

2542 

2544 

2546 

2548 


8000 

8006 

8012 

8019 

8025 

8031 

8037 

8044 

8050 

8056 

65 

2550 

2551 

2553 

2555 

2557 

2559 

2561 

2563 

2565 

2567 


8062 

8068 

8075 

8081 

8087 

8093 

8099 

8106 

8112 

8118 

66 

2569 

2571 

2573 

2575 

2577 

2579 

2581 

2583

2585 

2587 


8124 

8130 

8136 

8142 

8149 

8155 

8161 

8167 

8173 

8179 

67 

2588 

2590 

2592 

2594 

2596 

2598 

2600 

2602 

2604 

2606 


8185 

8191 

8198 

8204

8210 

8216 

8222 

8228 

8234 

8240 

68 

2608 

2610 

2612 

2613 

2615 

2617 

2619 

2621 

2623 

2625 


8246 

8252 

8258 

8264 

8270 

8276 

8283 

8289 

8295 

8301 

69 

2627 

2629 

2631

2632

2634

2636

2638

2640 

2642 

2644 


8307 

8313 

8319 

8325

8331 

8337

8343 

8349 

8355 

8361 


The first significant figure and the position of the decimal point must 
be determined in accordance with the instructions given. 



SQUARE ROOTS 501



0 

1 

2 

3 

4 

5 

6 

7 

8 

9 

70 

2646 

8367

2648 

8373 

2650 

8379 

2651 

8385 

2653 

8390 

2655 

8396 

2657 

8402 

2659

8408 

2661 

8414 

2663 

8420 

71 

2665 

8426 

2666 

8432 

2668 

8438 

2670 

8444 

2672 

8450 

2674 

8456 

2676 

8462 

2678 

8468 

2680 

8473 

2681 

8479 

72 

2683 

8485 

2685 

8491 

2687 

8497 

2689 

8503 

2691 

8509 

2693 

8515 

2694 

8521 

2696 

8526 

2698 

8532 

2700 

8538 

73 

2702 

8544 

2704 

8550 

2706 

8556 

2707 

8562 

2709 

8567 

2711 

8573 

2713 

8579 

2715 

8585 

2717 

8591 

2718 

8597 

74 

2720 

8602 

2722 

8608 

2724 

8614 

2726 

8620 

2728 

8626 

2729 

8631 

2731 

8637 

2733 

8643 

2735 

8649 

2737 

8654 

75 

2739

8660 

2740 

8666 

2742 

8672 

2744 

8678 

2746 

8683 

2748 

8689 

2750 

8695 

2751 

8701 

2753 

8706 

2755 

8712 

76 

2757 

8718 

2759 

8724 

2760 

8729 

2762 

8735 

2764 

8741 

2766 
8746

2768 

8752 

2769 

8758 

2771 

8764 

2773 

8769 

77 

2775 

8775 

2777 

8781 

2778 

8786 

2780 

8792

2782 

8798 

2784 

8803 

2786 

8809 

2787 

8815 

2789 

8820 

2791 

8826 

78 

2793 

8832 

2795 

8837 

2796 

8843 

2798 

8849 

2800 

8854 

2802 

8860 

2804 

8866 

2805 

8871 

2807 

8877 

2809 

8883 

79 

2811 

8888 

2812 

8894 

2814 

8899 

2816 

8905 

2818 

8911 

2820 

8916 

2821 

8922 

2823 

8927 

2825 

8933 

2827 

8939 

80 

2828 

8944 

2830 

8950 

2832 

8955 

2834 

8961 

2835 

8967 

2837 

8972 

2839

8978 

2841 

8983 

2843 

8989 

2844 

8994 

81 

2846 

9000 

2848 

9006 

2850 

9011 

2851 

9017 

2853 

9022 

2855 

9028 

2857 

9033 

2858 

9039 

2860 

9044 

2862 

9050 

82 

2864 

9055

2865 

9061 

2867 

9066 

2869 

9072 

2871 

9077 

2872 

9083 

2874 

9088 

2876 

9094 

2877 

9099 

2879 

9105 

83 

2881 

9110 

2883 

9116 

2884 

9121 

2886 

9127 

2888 

9132 

2890

9138

2891

9143

2893 

9149 

2895 

9154 

2897 

9160 

84 

2898 

9165

2900 

9171 

2902 

9176 

2903 

9182 

2905 

9187 

2907 

9192 

2909 

9198 

2910 

9203 

2912 

9209 

2914 

9214 


The first significant figure and the position of the decimal point must
be determined in accordance with the instructions given.





502 


SQUARE ROOTS



0 

1

2 

3 

4 

5 

6 

7 

8 

9 

85 

2915 

2917 

2919 

2921 

2922 

2924 

2926 

2927 

2929 

2931 


9220 

9225 

9230 

9236 

9241 

9247 

9252 

9257 

9263 

9268 

86 

2933 

2934 

2936 

2938 

2939 

2941 

2943 

2944 

2946 

2948 


9274 

9279 

9284 

9290 

9295 

9301 

9306 

9311 

9317 

9322 

87 

2950 

2951 

2953 

2955 

2956 

2958 

2960 

2961 

2963 

2965 


9327 

9333 

9338 

9343 

9349 

9354 

9359 

9365 

9370 

9375 

88 

2966 

2968 

2970 

2972 

2973 

2975 

2977 

2978 

2980 

2982 


9381 

9386 

9391 

9397 

9402 

9407 

9413 

9418

9423

9429 

89 

2983 

2985 

2987 

2988 

2990 

2992 

2993 

2995 

2997 

2998 


9434 

9439 

9445 

9450 

9455 

9460 

9466 

9471 

9476 

9482 

90 

3000 

3002 

3003 

3005 

3007 

3008 

3010 

3012 

3013 

3015 


9487 

9492 

9497 

9503 

9508 

9513 

9518 

9524 

9529 

9534 

91 

3017 

3018 

3020 

3022 

3023 

3025 

3027 

3028 

3030 

3032 


9539 

9545 

9550 

9555 

9560 

9566 

9571 

9576 

9581 

9586 

92 

3033 

3035 

3036 

3038 

3040 

3041 

3043 

3045 

3046 

3048 


9592 

9597 

9602 

9607 

9612 

9618 

9623 

9628 

9633 

9638 

93 

3050 

3051 

3053 

3055 

3056 

3058 

3059 

3061 

3063 

3064 


9644 

9649 

9654 

9659 

9664 

9670 

9675 

9680 

9685 

9690 

94 

3066 

3068 

3069 

3071 

3072 

3074 

3076 

3077 

3079 

3081 


9695

9701 

9706 

9711 

9716 

9721 

9726 

9731 

9737 

9742 

95 

3082 

3084 

3085 

3087 

3089 

3090 

3092 

3094 

3095 

3097 


9747 

9752 

9757 

9762 

9767 

9772 

9778 

9783 

9788 

9793 

96 

3098 

3100 

3102 

3103 

3105 

3106 

3108 

3110 

3111 

3113 


9798 

9803 

9808 

9813 

9818 

9823 

9829 

9834 

9839 

9844 

97 

3114 

3116 

3118 

3119 

3121 

3122 

3124 

3126 

3127 

3129 


9849 

9854 

9859 

9864 

9869 

9874 

9879 

9884 

9889 

9894 

98 

3130 

3132 

3134 

3135 

3137 

3138 

3140 

3142

3143 

3145 


9899 

9905 

9910 

9915 

9920 

9925 

9930 

9935 

9940 

9945 

99 

3146 

3148 

3150 

3151 

3153 

3154 

3156 

3158 

3159 

3161 


9950 

9955 

9960 

9965 

9970 

9975 

9980 

9985 

9990 

9995 


The first significant figure and the position of the decimal point must
be determined in accordance with the instructions given.
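Against each three-figure argument the square-root table carries two entries: the root of the number read as 1.00 to 9.99 and the root of ten times that number, which is why the first significant figure has to be chosen by the reader. The sketch below reproduces such a pair under the assumption of ordinary rounding to four figures; the name `sqrt_pair` is illustrative and does not come from the text.

```python
import math


def sqrt_pair(n: int) -> tuple[int, int]:
    """Both four-figure entries printed against the argument n (100..999).

    The first entry is the root of n/100 (1.00 to 9.99), the second the
    root of n/10 (10.0 to 99.9); 184 therefore gives the figures of
    sqrt(1.84) and sqrt(18.4).
    """
    if not 100 <= n <= 999:
        raise ValueError("n must be a three-figure integer")
    upper = round(math.sqrt(n / 100) * 1000)
    lower = round(math.sqrt(n / 10) * 1000)
    return upper, lower


if __name__ == "__main__":
    # sqrt(1.84) = 1.356 and sqrt(18.4) = 4.290, to four figures
    print(sqrt_pair(184))
```

For a number such as 1840 the argument is paired off from the decimal point (18 | 40), so the second entry applies and the root is read as 42.90.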



RECIPROCALS


503 


0 1 2 3 4 5 6 7 8 9

1.0 1.0000 .9901 .9804 .9709 .9615 .9524 .9434 .9346 .9259 .9174
1.1 .9091 .9009 .8929 .8850 .8772 .8696 .8621 .8547 .8475 .8403
1.2 .8333 .8264 .8197 .8130 .8065 .8000 .7937 .7874 .7813 .7752
1.3 .7692 .7634 .7576 .7519 .7463 .7407 .7353 .7299 .7246 .7194
1.4 .7143 .7092 .7042 .6993 .6944 .6897 .6849 .6803 .6757 .6711
1.5 .6667 .6623 .6579 .6536 .6494 .6452 .6410 .6369 .6329 .6289
1.6 .6250 .6211 .6173 .6135 .6098 .6061 .6024 .5988 .5952 .5917
1.7 .5882 .5848 .5814 .5780 .5747 .5714 .5682 .5650 .5618 .5587
1.8 .5556 .5525 .5495 .5464 .5435 .5405 .5376 .5348 .5319 .5291
1.9 .5263 .5236 .5208 .5181 .5155 .5128 .5102 .5076 .5051 .5025
2.0 .5000 .4975 .4950 .4926 .4902 .4878 .4854 .4831 .4808 .4785
2.1 .4762 .4739 .4717 .4695 .4673 .4651 .4630 .4608 .4587 .4566
2.2 .4545 .4525 .4505 .4484 .4464 .4444 .4425 .4405 .4386 .4367
2.3 .4348 .4329 .4310 .4292 .4274 .4255 .4237 .4219 .4202 .4184
2.4 .4167 .4149 .4132 .4115 .4098 .4082 .4065 .4049 .4032 .4016
2.5 .4000 .3984 .3968 .3953 .3937 .3922 .3906 .3891 .3876 .3861
2.6 .3846 .3831 .3817 .3802 .3788 .3774 .3759 .3745 .3731 .3717
2.7 .3704 .3690 .3676 .3663 .3650 .3636 .3623 .3610 .3597 .3584
2.8 .3571 .3559 .3546 .3534 .3521 .3509 .3497 .3484 .3472 .3460
2.9 .3448 .3436 .3425 .3413 .3401 .3390 .3378 .3367 .3356 .3344
3.0 .3333 .3322 .3311 .3300 .3289 .3279 .3268 .3257 .3247 .3236
3.1 .3226 .3215 .3205 .3195 .3185 .3175 .3165 .3155 .3145 .3135
3.2 .3125 .3115 .3106 .3096 .3086 .3077 .3067 .3058 .3049 .3040
3.3 .3030 .3021 .3012 .3003 .2994 .2985 .2976 .2967 .2959 .2950
3.4 .2941 .2933 .2924 .2915 .2907 .2899 .2890 .2882 .2874 .2865
3.5 .2857 .2849 .2841 .2833 .2825 .2817 .2809 .2801 .2793 .2786
3.6 .2778 .2770 .2762 .2755 .2747 .2740 .2732 .2725 .2717 .2710
3.7 .2703 .2695 .2688 .2681 .2674 .2667 .2660 .2653 .2646 .2639
3.8 .2632 .2625 .2618 .2611 .2604 .2597 .2591 .2584 .2577 .2571
3.9 .2564 .2558 .2551 .2545 .2538 .2532 .2525 .2519 .2513 .2506
4.0 .2500 .2494 .2488 .2481 .2475 .2469 .2463 .2457 .2451 .2445
4.1 .2439 .2433 .2427 .2421 .2415 .2410 .2404 .2398 .2392 .2387



504 


RECIPROCALS


0 1 2 3 4 5 6 7 8 9

4.2 .2381 .2375 .2370 .2364 .2358 .2353 .2347 .2342 .2336 .2331
4.3 .2326 .2320 .2315 .2309 .2304 .2299 .2294 .2288 .2283 .2278
4.4 .2273 .2268 .2262 .2257 .2252 .2247 .2242 .2237 .2232 .2227
4.5 .2222 .2217 .2212 .2208 .2203 .2198 .2193 .2188 .2183 .2179
4.6 .2174 .2169 .2165 .2160 .2155 .2151 .2146 .2141 .2137 .2132
4.7 .2128 .2123 .2119 .2114 .2110 .2105 .2101 .2096 .2092 .2088
4.8 .2083 .2079 .2075 .2070 .2066 .2062 .2058 .2053 .2049 .2045
4.9 .2041 .2037 .2033 .2028 .2024 .2020 .2016 .2012 .2008 .2004
5.0 .2000 .1996 .1992 .1988 .1984 .1980 .1976 .1972 .1969 .1965
5.1 .1961 .1957 .1953 .1949 .1946 .1942 .1938 .1934 .1931 .1927
5.2 .1923 .1919 .1916 .1912 .1908 .1905 .1901 .1898 .1894 .1890
5.3 .1887 .1883 .1880 .1876 .1873 .1869 .1866 .1862 .1859 .1855
5.4 .1852 .1848 .1845 .1842 .1838 .1835 .1832 .1828 .1825 .1821
5.5 .1818 .1815 .1812 .1808 .1805 .1802 .1799 .1795 .1792 .1789
5.6 .1786 .1783 .1779 .1776 .1773 .1770 .1767 .1764 .1761 .1757
5.7 .1754 .1751 .1748 .1745 .1742 .1739 .1736 .1733 .1730 .1727
5.8 .1724 .1721 .1718 .1715 .1712 .1709 .1706 .1704 .1701 .1698
5.9 .1695 .1692 .1689 .1686 .1684 .1681 .1678 .1675 .1672 .1669
6.0 .1667 .1664 .1661 .1658 .1656 .1653 .1650 .1647 .1645 .1642
6.1 .1639 .1637 .1634 .1631 .1629 .1626 .1623 .1621 .1618 .1616
6.2 .1613 .1610 .1608 .1605 .1603 .1600 .1597 .1595 .1592 .1590
6.3 .1587 .1585 .1582 .1580 .1577 .1575 .1572 .1570 .1567 .1565
6.4 .1563 .1560 .1558 .1555 .1553 .1550 .1548 .1546 .1543 .1541
6.5 .1538 .1536 .1534 .1531 .1529 .1527 .1524 .1522 .1520 .1517
6.6 .1515 .1513 .1511 .1508 .1506 .1504 .1502 .1499 .1497 .1495
6.7 .1493 .1490 .1488 .1486 .1484 .1481 .1479 .1477 .1475 .1473
6.8 .1471 .1468 .1466 .1464 .1462 .1460 .1458 .1456 .1453 .1451
6.9 .1449 .1447 .1445 .1443 .1441 .1439 .1437 .1435 .1433 .1431
7.0 .1429 .1427 .1425 .1422 .1420 .1418 .1416 .1414 .1412 .1410
7.1 .1408 .1406 .1404 .1403 .1401 .1399 .1397 .1395 .1393 .1391
7.2 .1389 .1387 .1385 .1383 .1381 .1379 .1377 .1376 .1374 .1372
7.3 .1370 .1368 .1366 .1364 .1362 .1361 .1359 .1357 .1355 .1353
7.4 .1351 .1350 .1348 .1346 .1344 .1342 .1340 .1339 .1337 .1335





      0     1     2     3     4     5     6     7     8     9
7.5   .1333 .1332 .1330 .1328 .1326 .1325 .1323 .1321 .1319 .1318
7.6   .1316 .1314 .1312 .1311 .1309 .1307 .1305 .1304 .1302 .1300
7.7   .1299 .1297 .1295 .1294 .1292 .1290 .1289 .1287 .1285 .1284
7.8   .1282 .1280 .1279 .1277 .1276 .1274 .1272 .1271 .1269 .1267
7.9   .1266 .1264 .1263 .1261 .1259 .1258 .1256 .1255 .1253 .1252
8.0   .1250 .1248 .1247 .1245 .1244 .1242 .1241 .1239 .1238 .1236
8.1   .1235 .1233 .1232 .1230 .1229 .1227 .1225 .1224 .1222 .1221
8.2   .1220 .1218 .1217 .1215 .1214 .1212 .1211 .1209 .1208 .1206
8.3   .1205 .1203 .1202 .1200 .1199 .1198 .1196 .1195 .1193 .1192
8.4   .1190 .1189 .1188 .1186 .1185 .1183 .1182 .1181 .1179 .1178
8.5   .1176 .1175 .1174 .1172 .1171 .1170 .1168 .1167 .1166 .1164
8.6   .1163 .1161 .1160 .1159 .1157 .1156 .1155 .1153 .1152 .1151
8.7   .1149 .1148 .1147 .1145 .1144 .1143 .1142 .1140 .1139 .1138
8.8   .1136 .1135 .1134 .1133 .1131 .1130 .1129 .1127 .1126 .1125
8.9   .1124 .1122 .1121 .1120 .1119 .1117 .1116 .1115 .1114 .1112
9.0   .1111 .1110 .1109 .1107 .1106 .1105 .1104 .1103 .1101 .1100
9.1   .1099 .1098 .1096 .1095 .1094 .1093 .1092 .1091 .1089 .1088
9.2   .1087 .1086 .1085 .1083 .1082 .1081 .1080 .1079 .1078 .1076
9.3   .1075 .1074 .1073 .1072 .1071 .1070 .1068 .1067 .1066 .1065
9.4   .1064 .1063 .1062 .1060 .1059 .1058 .1057 .1056 .1055 .1054
9.5   .1053 .1052 .1050 .1049 .1048 .1047 .1046 .1045 .1044 .1043
9.6   .1042 .1041 .1040 .1038 .1037 .1036 .1035 .1034 .1033 .1032
9.7   .1031 .1030 .1029 .1028 .1027 .1026 .1025 .1024 .1022 .1021
9.8   .1020 .1019 .1018 .1017 .1016 .1015 .1014 .1013 .1012 .1011
9.9   .1010 .1009 .1008 .1007 .1006 .1005 .1004 .1003 .1002 .1001
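
Any entry of the table can be checked by direct division: the row label gives the first two figures of the number, the column heading gives the third, and the entry is the reciprocal to four decimal places. The short Python sketch below is an illustration only and not part of the original text; the function names are hypothetical, and it assumes that an exact half in the fifth place is rounded upwards.

```python
from decimal import Decimal, ROUND_HALF_UP


def reciprocal_4fig(n_hundredths: int) -> str:
    """Reciprocal of n_hundredths/100 to four decimal places.

    Example: 251 stands for 2.51, so the result is 1/2.51.
    Assumption: exact halves in the fifth place round upwards.
    """
    value = Decimal(100) / Decimal(n_hundredths)
    return str(value.quantize(Decimal("0.0001"), rounding=ROUND_HALF_UP))


def table_row(tenths: int) -> list[str]:
    """The ten entries (columns 0 to 9) of the row labelled tenths/10."""
    return [reciprocal_4fig(tenths * 10 + col) for col in range(10)]


if __name__ == "__main__":
    # Row 2.5, column 1 is the reciprocal of 2.51.
    print(reciprocal_4fig(251))
    print(table_row(25))
```

Decimal arithmetic is used in place of ordinary floating-point rounding so that a value such as 1/6.4 = 0.15625 is rounded to the figure a hand-computed table would show.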





INDEX


Abscissa, axis of, 310. 

Absolute error, 53. 

Accuracy, 51. 

Aggregate expenditure method, 229. 
Aggregative weighting, 222.
Agricultural statistics, 66. 
Approximation, 56. 

Arithmetic average, 128—135.
— of relatives, 214.
—, weighted, 135—142.
Association, 418.
Averages, limitations of, 151.
—, method of moving, 357.
— of the first order, 149.

— typical and descriptive, 149. 

Bar diagram, 276. 

Bar frequency diagram, 335. 

Base line, false, 323.

Base shifting, 219. 

Biased errors, 54. 

Blank form, 43, 462. 

Business activity index, 235, 262, 266.

“Capital” Index of Indian Industrial Activity, 257.

Census of production, 482. 

— reports, 73. 

Chain base, 212.

— relatives, 216. 

Choice of averages, 150.
— — measures of dispersion, 185.
— — questions, 43.

Circles, 291. 

Classification, 83. 
Coefficient, 36, 100.
— of association, 420.
— — concurrent deviations, 395.
— — correlation, 386.
— — mean deviation, 173.
— — quartile deviation, 184.
— — skewness, 193.
— — variation, 184.

Composite unit, 36. 

Correlation, assumptions of, 393. 

— by graphic method, 398.

Cost of living index numbers, 70, 226—234, 249—254, 260, 265.
Cubes, 297. 

Data, primary and secondary, 39. 

—, selection of representative, 43.

Deciles, 124.

Derivatives, subordinate, 99. 

—, co-ordinate, 100.

Deviation, mean, 171. 

— , quartile, 184. 

— , standard, 178. 

Diagrams, 270. 

Dispersion, measures of, 17”. 

Economic barometers, 443. 

Editing primary data, 51. 

— secondary data, 57.

Expectation, 415.

Factor reversal test, 224. 

Family budget method, 230.

Fisher’s Ideal formula, 223. 
Fixed base method, 210. 

Fluctuations — 
cyclical, 354, 360. 
seasonal, 354. 

Forecasting, 442. 

Frequency graphs, 331. 

Functions of statistician, 15. 

Galton’s method of locating the median, 344.
Geometric average, 142.
— of relatives, 214, 218.
—, weighted, 144.





Graphs of continuous time series,
— on ratio scale,
Harmonic average, 147.

Histogram, 336. 

Historigram, 315. 

Independence, criterion of, 416.

Index numbers of prices, 2-i*
241», 258—260, 26:U-265.
— —, reversibility of, 217.
— — Scheme of the Govt. of India, 255.

Indices of business conditions, 235, 266.

— — industrial activity, 2^14, 256, 
262. 

— — production, 260 — 262, 265. 
Inertia of large numbers, 46.
Interpolation, 428. 

Interpretation, 32. 

Lagrange’s formula, 440. 

Law of statistical regularity, 46. 
Logarithmic curves, 326.

Lorenz curve, ISM

Measurement of the national income 
of India, 476. 

Median, 117 — 124. 

— of relatives, 214. 

Mode, 110—117.
Modulus, 183.
Newton’s formula, 438.

Normal curve of error, 340. 

Notation and terminology, 411. 

Ogive curve, 341. 

Official statistics, 62. 

Parabolic curve, fitting with a, 436. 
Percentiles, 124. 

Periodicity, 360.

Pictograms, 300. 

Possible error, 56. 

Probability, 45, 415. 

Probable error, 393. 


Quartile deviation, 184.
Quartiles, 124.
Questions, choice of, 43.
Questionnaire, 43, 466.

Random sampling, 44. 

Range, 172. 

Rate, 100.

Ratio, 100, 105. 

Rectangles, 285. 

Rectangular co-ordinates, 311.
Relative error, 53.
Reversibility of index numbers, 217.

Seasonal variations, 365.
Sectors, 203.
Selection of representative data, 43.
Short-time oscillations, 356.
Skewness, 190.
Squares, 289.
Standard deviation, 178.
Standardized death-rates, 152.
Statistics, definition of, 9, 12.
—, distrust of, 26.
—, functions of, 19.
—, main divisions of, 16.
—, Trade, 72.
—, vital, 78.
Statistical inquiries, types of, 32.

— material in India, 61. 

— methods, 10. 

Surveys, 479, 480. 

Tabulation, 89. 

Time reversal test, 223. 

Time series, 88, 353. 

Trend, 354, 362.

Units of measurement, 34. 

— , simple and composite, 36. 

Variance, 184. 

Wages, 70. 

Weighted average, 135—142.
— — of relatives, 222.

Weights, explicit, 221. 

— , implicit, 220. 



CORRIGENDA 


Page 

86, Line 20, read\ for , after variety. 

121, Line 3, read them for it. 

122, Table 10, Col. 1, read 1-5 for 1-4. 

130, Line 3, add dx before = 

137, Line 9, read this sum for it. 

155, Ex. 7(c), read series is for serious. 

157, add after 9th line, (21) Find the Geometric mean of the above series.

174, Line 11, read 5m for 5. 

175, Line 1, read fr 

182, Table 25, Col. (e), read dx for d.
Col. (f), read d²x for d².
Col. (g), read fd²x for fd².

283, Line 11 from bottom, add is after It. 

315, Line 17, read graphs for diagrams. 

338, Line 19, read 25 for 24. 

340, Line 3, read 25 for 17. 

342, Last line, read for for from. 

349, Ex. 13, read according for assording.

351, Line 11, read 23 for 28. 

357, Line 17, add by after divided. 

381, Line 22, read a^ for 

390, Line 3, read √112-66 for 112-66.

417, Line 19, read (h) for {h) 

Line 23. reaJ 

419, Lines 10 & 13, read 60 for 15. 

423, Ex. 2, read Given for give. 

430, Line 24, read interpolated for intrefxilated. 

436, 8th line from bottom, read Allahabad for India.

438, 11th line from bottom, last figure, read - -25 for -25 

441, 4th line from bottom, read 20 for 10. 

3rd line from bottom, read 264 for 528. 

2nd line from bottom, read 5445 for 4863. 

1st line from bottom, read 41 for 39. 

442, Line 2, read 41 for 39.

444, Line 8, omit can.

446, Ex. 12, read Calculate for Calcutta.

449, Ex. 23, add of after census.

460, Line 19, add 100 after which.







